
EDIT DATE
4 September, 20152

DOCUMENT-REV. NUM. GEN-CXXXXX-REVA

PAGE 1 of 54

Author:

Laurent Lefebvre

| issue to: | Copy No |
|-----------|---------|
|           | 1       |

# **R400 Sequencer Specification**

# SQ

### Version 2.010

Overview: This is an architectural specification for the R400 Sequencer block (SEQ). It provides an overview of the required capabilities and expected uses of the block. It also describes the block interfaces, internal subblocks, and provides internal state diagrams.

**AUTOMATICALLY UPDATED FIELDS:** 

Document Location: C:\perforce\r400\doc\_lib\design\blocks\sq\R400\_Sequencer.doc

Current Intranet Search Title: R400 Sequencer Specification

|           | APPROVALS      |  |  |  |  |  |  |
|-----------|----------------|--|--|--|--|--|--|
| Name/Dept | Signature/Date |  |  |  |  |  |  |
|           |                |  |  |  |  |  |  |
|           |                |  |  |  |  |  |  |
|           |                |  |  |  |  |  |  |
|           |                |  |  |  |  |  |  |
|           |                |  |  |  |  |  |  |

#### Remarks:

THIS DOCUMENT CONTAINS CONFIDENTIAL INFORMATION THAT COULD BE SUBSTANTIALLY DETRIMENTAL TO THE INTEREST OF ATI TECHNOLOGIES INC. THROUGH UNAUTHORIZED USE OR DISCLOSURE.

"Copyright 2001, ATI Technologies Inc. All rights reserved. The material in this document constitutes an unpublished work created in 2001. The use of this copyright notice is intended to provide notice that ATI owns a copyright in this unpublished work. The copyright notice is not an admission that publication has occurred. This work contains confidential, proprietary information and trade secrets of ATI. No part of this document may be used, reproduced, or transmitted in any form or by any means without the prior written permission of ATI Technologies Inc."

Exhibit 2029.docR400\_Sequencer.doc 73711 Bytes\*\*\* © ATI Confidential. Reference Copyright Notice on Cover Page © \*\*\*

ATI 2029 LG v. ATI IPR2015-00325

AMD1044\_0257395




# Table Of Contents

| Section | Title                                      |
|---------|--------------------------------------------|
| 1.      | OVERVIEW                                   |
| 1.1     | Top Level Block Diagram                    |
| 1.2     | Data Flow graph (SP)                       |
| 1.3     | Control Graph                              |
| 2.      | INTERPOLATED DATA BUS                      |
| 3.      | INSTRUCTION STORE                          |
| 4.      | SEQUENCER INSTRUCTIONS                     |
| 5.      | CONSTANT STORES                            |
| 5.1     | Memory organizations                       |
| 5.2     | Management of the Control Flow Constants   |
| 5.3     | Management of the re-mapping tables        |
| 5.3.1   | R400 Constant management                   |
| 5.3.2   | Proposal for R400LE constant management    |
| 5.3.3   | Dirty bits                                 |
| 5.3.4   | Free List Block                            |
| 5.3.5   | De-allocate Block                          |
| 5.3.6   | Operation of Incremental model             |
| 5.4     | Constant Store Indexing                    |
| 5.5     | Real Time Commands                         |
| 5.6     | Constant Waterfalling                      |
| 6.      | LOOPING AND BRANCHES                       |
| 6.1     | The controlling state                      |
| 6.2     | The Control Flow Program                   |
| 6.2.1   | Control flow instructions table            |
| 6.3     | Implementation                             |
| 6.4     | Data dependant predicate instructions      |
| 6.5     | HW Detection of PV,PS                      |
| 6.6     | Register file indexing                     |
| 6.7     | Debugging the Shaders                      |
| 6.7.1   | Method 1: Debugging registers              |
| 6.7.2   | Method 2: Exporting the values in the GPRs |
| 7.      | PIXEL KILL MASK                            |
| 8.      | MULTIPASS VERTEX SHADERS (HOS)             |
| 9.      | REGISTER FILE ALLOCATION                   |
| 10.     | FETCH ARBITRATION                          |
| 11.     | ALU ARBITRATION                            |
| 12.     | HANDLING STALLS                            |
| 13.     | CONTENT OF THE RESERVATION STATION FIFOS   |
| 14.     | THE OUTPUT FILE                            |
| 15.     | IJ FORMAT                                  |
| 15.1    | Interpolation of constant attributes       |
| 16.     | STAGING REGISTERS                          |

| Section | Title                                              |
|---------|----------------------------------------------------|
| 17.     | THE PARAMETER CACHE                                |
| 17.1    | Export restrictions                                |
| 17.1.1  | Pixel exports                                      |
| 17.1.2  | Vertex exports                                     |
| 17.1.3  | Pass thru exports                                  |
| 19.     | SPECIAL INTERPOLATION MODES                        |
| 19.1    | Real time commands                                 |
| 19.3.2  |                                                    |
| 20.     | STATE MANAGEMENT                                   |
| 21.1    | Vertex indexes imports                             |
| 22.1    | Control                                            |
| 24.2.1  |                                                    |
| 24.2.2  |                                                    |
| 24.2.3  |                                                    |
| 24.2.4  |                                                    |
| 24.2.5  |                                                    |
| 24.2.6  |                                                    |
| 24.2.7  |                                                    |
| 24.2.8  |                                                    |
| 24.2.9  | TP to SQ: Texture stall                            |
| 24.2.10 | SQ to SP: Texture stall                            |
| 24.2.11 | SQ to SP: GPR and auto counter                     |
| 24.2.12 |                                                    |
| 24.2.13 |                                                    |
| 24.2.14 |                                                    |
| 27.2.13 | SP to SQ: Constant address load/ Predicate Set     |
| 27.2.14 | SQ to SPx: constant broadcast                      |
| 27.2.15 | SP0 to SQ: Kill vector load                        |
| 27.2.16 | SQ to CP: RBBM bus                                 |
| 27.2.17 | CP to SQ: RBBM bus                                 |
| 27.2.18 | SQ to CP: State report                             |
| 28.     | OPEN ISSUES                                        |




# Revision Changes:

Rev 0.1 (Laurent Lefebvre) Date: May 7, 2001

Rev 0.2 (Laurent Lefebvre) Date: July 9, 2001

Rev 0.3 (Laurent Lefebvre) Date: August 6, 2001

Rev 0.4 (Laurent Lefebvre) Date: August 24, 2001

Rev 0.5 (Laurent Lefebvre) Date: September 7, 2001

Rev 0.6 (Laurent Lefebvre) Date: September 24, 2001

Rev 0.7 (Laurent Lefebvre) Date: October 5, 2001

Rev 0.8 (Laurent Lefebvre) Date: October 8, 2001

Rev 0.9 (Laurent Lefebvre) Date: October 17, 2001

Rev 1.0 (Laurent Lefebvre) Date: October 19, 2001

Rev 1.1 (Laurent Lefebvre) Date: October 26, 2001

Rev 1.2 (Laurent Lefebvre) Date: November 16, 2001

Rev 1.3 (Laurent Lefebvre) Date: November 26, 2001

Rev 1.4 (Laurent Lefebvre) Date: December 6, 2001

Rev 1.5 (Laurent Lefebvre) Date: December 11, 2001

Rev 1.6 (Laurent Lefebvre) Date: January 7, 2002

Rev 1.7 (Laurent Lefebvre) Date: February 4, 2002

Rev 1.8 (Laurent Lefebvre) Date: March 4, 2002

Rev 1.9 (Laurent Lefebvre) Date: March 18, 2002

Rev 1.10 (Laurent Lefebvre) Date: March 25, 2002

Rev 1.11 (Laurent Lefebvre) Date: April 19, 2002

Rev 2.0 (Laurent Lefebvre) Date: April 19, 2002

First draft.

Changed the interfaces to reflect the changes in the SP. Added some details in the arbitration section. Reviewed the Sequencer spec after the meeting on August 3, 2001.

Added the dynamic allocation method for register file and an example (written in part by Vic) of the flow of pixels/vertices in the sequencer. Added timing diagrams (Vic)

Changed the spec to reflect the new R400 architecture. Added interfaces.

Added constant store management, instruction store management, control flow management and data dependant predication.

Changed the control flow method to be more flexible. Also updated the external interfaces.

Incorporated changes made in the 10/18/01 control flow meeting. Added a NOP instruction, removed the conditional\_execute\_or\_jump. Added debug registers.

Refined interfaces to RB. Added state registers.

Added SEQ-SP0 interfaces. Changed delta precision. Changed VGT→SP0 interface. Debug Methods added.

Interfaces greatly refined. Cleaned up the spec.

Added the different interpolation modes.

Added the auto incrementing counters. Changed the VGT-SQ interface. Added content on constant management. Updated GPRs.

Removed from the spec all interfaces that weren't directly tied to the SQ. Added explanations on constant management. Added synchronization fields and explanation.

Added more details on the staging register. Added detail about the parameter caches. Changed the call instruction to a Conditionnal\_call instruction. Added details on constant management and updated the diagram.

Added Real Time parameter control in the SX interface. Updated the control flow section.

New interfaces to the SX block. Added the end of clause modifier, removed the end of clause instructions.

Rearrangement of the CF instruction bits in order to ensure byte alignment.

Updated the interfaces and added a section on exporting rules.

Added CP state report interface. Last version of the spec with the old control flow scheme

New control flow scheme




Rev 2.01 (Laurent Lefebvre) Date: May 2, 2002 Changed slightly the control flow instructions to allow force jumps and calls.




# 1. Overview

The sequencer chooses two ALU threads and a fetch thread to execute, and executes all of the instructions in a block before looking for a new clause of the same type. The two ALU threads are executed interleaved to hide the ALU latency. The arbitrator gives priority to older threads. There are two separate reservation stations, one for pixel vectors and one for vertex vectors. This way a pixel can pass a vertex and a vertex can pass a pixel.

To support the shader pipe, the sequencer also contains the shader instruction cache, constant store, control flow constants and texture state. The four shader pipes all execute the same instruction, thus there is only one sequencer for the whole chip.

The sequencer first arbitrates between vectors of 64 vertices that arrive directly from primitive assembly and vectors of 16 quads (64 pixels) that are generated in the scan converter.

The vertex or pixel program specifies how many GPRs it needs to execute. The sequencer will not start the next vector until the needed space is available in the GPRs.
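The GPR gating rule above can be sketched as a simple admission check. This is only an illustration: the `GprPool` class, its method names and the pool size used in the usage note are hypothetical and do not come from this specification.

```python
from dataclasses import dataclass


@dataclass
class GprPool:
    """Hypothetical model of the GPR space tracked by the sequencer."""
    total: int      # total GPRs available (assumed value, not from the spec)
    used: int = 0   # GPRs currently held by vectors in flight

    def try_start(self, needed: int) -> bool:
        """Start a vector only if its GPR requirement fits in the free space."""
        if needed <= self.total - self.used:
            self.used += needed
            return True
        return False  # the vector waits until earlier vectors release GPRs

    def release(self, count: int) -> None:
        """Return GPRs to the pool when a vector finishes."""
        self.used = max(0, self.used - count)
```

For example, with an assumed pool of 128 GPRs, a vector needing 100 starts, a following vector needing 40 is held back, and it starts once the first vector releases its registers.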




# 1.1 Top Level Block Diagram



Figure 2: Reservation stations and arbiters

Under this new scheme, the sequencer (SQ) will only use one global state management machine per vector type (pixel, vertex) that we call the reservation station (RS).






The gray area represents blocks that are replicated 4 times per shader pipe (16 times on the overall chip).

# 1.3 Control Graph



Figure 4: Sequencer Control interfaces

The Fetch control interface is shown in green, the ALU control interface in red, the Interpolated/Vector control interface in blue, and the output file control interface in purple.

# 2. Interpolated data bus

The interpolators contain an IJ buffer to pack the information as tightly as possible before writing it to the register file.



Figure 6: Interpolation timing diagram


Above is an example of a tile the sequencer might receive from the SC. The write side shows how the data gets stacked into the XY and IJ buffers; the read side shows how the data is passed to the GPRs. The IJ information is packed in the IJ buffer 4 quads at a time, or two clocks. At any given time the sequencer allows as many as four quads to interpolate a parameter, and they all have to come from the same primitive. The sequencer then controls the write mask to the GPRs so that only the valid data is written.

# 3. Instruction Store

There is going to be only one instruction store for the whole chip. It will contain 4096 instructions of 96 bits each.

It is likely to be a single-port memory; we use 1 clock to load the ALU instruction, 1 clock to load the Fetch instruction, 1 clock to load 2 control flow instructions and 1 clock to write instructions.

The instruction store is loaded by the CP through the register-mapped registers.

The VS\_BASE and PS\_BASE context registers are used to specify for each context where its shader is in the instruction memory.

For the Real Time commands the story is much the same, with some small differences. There are no wrap-around points for real time, so the driver must be careful not to overwrite regular shader data. The shared code (shared subroutines) uses the same path as real time.

# 4. Sequencer Instructions

All control flow instructions and move instructions are handled by the sequencer only. The ALUs will perform NOPs during this time (MOV PV,PV, PS,PS) if they have nothing else to do.

# 5. Constant Stores

# 5.1 Memory organizations

A likely size for the ALU constant store is 1024x128 bits. The read BW from the ALU constant store is 128 bits/clock and the write bandwidth is 32 bits/clock (directed by the CP bus size not by memory ports).

The maximum logical size of the constant store for a given shader is 256 constants, or 512 for the pixel/vertex shader pair. The size of the re-mapping table is 128 lines (each line addresses 4 constants). The write granularity is 4 constants, or 512 bits. At 32 bits/clock it takes 16 clocks to write the four constants. Real time requires 256 lines in the physical memory (this is physically register mapped).

The texture state is also kept in a similar memory. The size of this memory is 320x96 bits (128 texture states for regular mode, 32 states for RT). The memory thus holds 128 texture states (192 bits per state). The logical size exposes 32 different states total, which are going to be shared between the pixel and the vertex shader. The size of the re-mapping table for the texture state memory is 32 lines (each line addresses 1 texture state line in the real memory). The CP write granularity is 1 texture state line (or 192 bits). The driver sends 512 bits but the CP ignores the top 320 bits. It thus takes 6 clocks to write the texture state. Real time requires 32 lines in the physical memory (this is physically register mapped).
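The write timings quoted in this section follow from dividing the write granularity by the 32 bits/clock write bandwidth. The short sketch below reproduces that arithmetic; the function and constant names are hypothetical and chosen only for illustration.

```python
CP_WRITE_BITS_PER_CLOCK = 32  # write bandwidth quoted in section 5.1


def write_clocks(payload_bits: int,
                 bits_per_clock: int = CP_WRITE_BITS_PER_CLOCK) -> int:
    """Clocks needed to write a payload, rounded up to whole clocks."""
    return -(-payload_bits // bits_per_clock)


alu_line_bits = 4 * 128      # one re-mapping line: four 128-bit ALU constants
texture_state_bits = 192     # one texture state

print(write_clocks(alu_line_bits))       # 16
print(write_clocks(texture_state_bits))  # 6

# The 320x96-bit texture memory, viewed as 192-bit states:
print((320 * 96) // 192)                 # 160 (128 regular + 32 RT)
```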

The control flow constant memory doesn't sit behind a renaming table. It is register mapped and thus the driver must reload its content each time there is a change in the control flow constants. Its size is 320\*32 because it must hold 8 copies of the 32 dwords of control flow constants and the loop construct constants must be aligned.

The constant re-mapping tables for texture state and ALU constants are logically register mapped for regular mode and physically register mapped for RT operation.




# 5.2 Management of the Control Flow Constants

The control flow constants are register mapped, thus the CP writes to the corresponding register to set the constant; the SQ decodes the address and writes to the block pointed to by its current base pointer (CF\_WR\_BASE). On the read side, one level of indirection is used. A register (SQ\_CONTEXT\_MISC.CF\_RD\_BASE) keeps the current base pointer to the control flow block. This register is copied whenever there is a state change. Should the CP write to CF after the state change, the base register is updated with (current pointer number + 1) % number of states. This way, if the CP doesn't write to CF, the new state uses the previous CF constants.
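A minimal sketch of the base pointer update described above, under simplifying assumptions: a single `base` variable stands in for both CF\_WR\_BASE and CF\_RD\_BASE of the newest state, and the class and method names are hypothetical. The count of 8 copies of 32 dwords is taken from section 5.1.

```python
NUM_COPIES = 8         # copies of the control flow constants (section 5.1)
DWORDS_PER_COPY = 32   # dwords of control flow constants per copy


class CfConstantStore:
    """Hypothetical model of the control flow constant blocks."""

    def __init__(self) -> None:
        self.blocks = [[0] * DWORDS_PER_COPY for _ in range(NUM_COPIES)]
        self.base = 0              # base pointer of the newest state
        self.pending_bump = False  # True from a state change until the first CF write

    def state_change(self) -> None:
        """The new state inherits the current base pointer."""
        self.pending_bump = True

    def cp_write(self, index: int, value: int) -> None:
        """First CF write after a state change moves to the next block."""
        if self.pending_bump:
            self.base = (self.base + 1) % NUM_COPIES
            self.pending_bump = False
        self.blocks[self.base][index] = value

    def read(self, index: int) -> int:
        return self.blocks[self.base][index]
```

In this sketch a state change followed by no CF write keeps reading the previous block, which matches the behaviour described in the text. The new block is not pre-filled from the old one, since section 5.1 says the driver reloads the content on a change.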

### 5.3 Management of the re-mapping tables

#### 5.3.1 R400 Constant management

The sequencer is responsible for managing two re-mapping tables (one for the constant store and one for the texture state). On a state change (by the driver), the sequencer will broadside copy the contents of its re-mapping tables to a new one. We have 8 different re-mapping tables we can use concurrently.

The constant memory update will be incremental: the driver only needs to update the constants that actually changed between the two state changes.
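The broadside copy can be pictured with the following sketch. It assumes the "new" table is simply the next of the 8 tables in rotation, which this section does not state; the class and method names are hypothetical.

```python
from typing import List, Optional

NUM_TABLES = 8  # re-mapping tables usable concurrently


class RemapTables:
    """Hypothetical model of one set of re-mapping tables."""

    def __init__(self, lines: int) -> None:
        # Each entry maps a logical line to a physical line (None = unmapped).
        self.tables: List[List[Optional[int]]] = [
            [None] * lines for _ in range(NUM_TABLES)
        ]
        self.current = 0

    def state_change(self) -> None:
        """Copy the whole current table into the next one (assumed rotation)."""
        nxt = (self.current + 1) % NUM_TABLES
        self.tables[nxt] = list(self.tables[self.current])
        self.current = nxt

    def remap(self, line: int, physical: int) -> None:
        """Point a logical line of the current state at a physical line."""
        self.tables[self.current][line] = physical
```

After a state change, only the lines the driver rewrites differ between the old and new tables, which is what makes the incremental update possible.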

For this model to work in its simplest form, the physical memory MUST be at least twice as large as the logical address space, plus the space allocated for Real Time. In our case, since the logical address space is 512 entries and the reserved RT space can be up to 256 entries, the memory must hold 512\*2+256 = 1280 entries or more. Similarly, the texture store must hold 32\*2+32 = 96 entries or more.
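The sizing rule above is a one-line formula. The helper below restates it so the two figures can be checked; the function name is hypothetical.

```python
def min_physical_entries(logical_entries: int, rt_entries: int) -> int:
    """Twice the logical address space plus the reserved Real Time space."""
    return 2 * logical_entries + rt_entries


print(min_physical_entries(512, 256))  # 1280 (ALU constant store)
print(min_physical_entries(32, 32))    # 96 (texture state store)
```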

#### 5.3.2 Proposal for R400LE constant management

To make this scheme work with only 512+256 = 768 entries, upon reception of a CONTROL packet of state + 1, the sequencer would check for SQ\_IDLE and PA\_IDLE and, if both are idle, erase the content of state and replace it with the new state (this is depicted in Figure 8: De-allocation mechanism). Note that when a state is cleared, a value of 0 is written to the corresponding de-allocation counter location so that when the SQ reports a state change, nothing will be de-allocated upon the first report.

The second path sets all context dirty bits that were used in the current state to 1 (thus allowing the new state to reuse these physical addresses if needed).








Figure 8: De-allocation mechanism for R400LE

#### 5.3.3 Dirty bits

Two sets of dirty bits will be maintained per logical address. The first one (the reset dirty bit) will be set to zero on reset and set when the logical address is addressed. The second one (the context dirty bit) will be set to zero whenever a new context is written and set for each address written while in this context. If the reset dirty bit is not set, then writing to that logical address will not require de-allocation of whatever address is stored in the renaming table. If it is set and the context dirty bit is not set, then the stored physical address needs to be de-allocated and a new physical address is necessary to store the incoming data. If they are both set, then the data will be written into the physical address held in the renaming table for the current logical address; no de-allocation or allocation takes place. This will happen when the driver does a set constant twice to the same logical address between context changes. NOTE: It is important to detect this case and prevent a new allocation. Failure to do so will allow multiple writes to allocate all physical memory and thus hang, because a context will not fit for rendering to start and thus free up space.
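The three cases above amount to a small decision table. The sketch below encodes it; the enum and function names are hypothetical. The combination "reset dirty clear, context dirty set" should not occur according to the description, and is treated here like the first case as an assumption.

```python
from enum import Enum


class WriteAction(Enum):
    ALLOCATE_ONLY = "allocate a physical address, nothing to de-allocate"
    DEALLOCATE_AND_ALLOCATE = "free the old physical address, allocate a new one"
    OVERWRITE_IN_PLACE = "reuse the physical address already in the renaming table"


def classify_write(reset_dirty: bool, context_dirty: bool) -> WriteAction:
    """Choose the action for a set-constant write to one logical address."""
    if not reset_dirty:
        # Never written since reset: no stale physical address to free.
        return WriteAction.ALLOCATE_ONLY
    if not context_dirty:
        # Written in an earlier context but not yet in this one.
        return WriteAction.DEALLOCATE_AND_ALLOCATE
    # Written twice in the same context: overwrite, no new allocation.
    return WriteAction.OVERWRITE_IN_PLACE
```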

#### 5.3.4 Free List Block

A free list block that would consist of a counter (called the IFC or Initial Free Counter) that would reset to zero and incremented every time a chunk of physical memory is used until they have all been used once. This counter would be checked each time a physical block is needed, and if the original ones have not been used up, us a new one, else check the free list for an available physical block address. The count is the physical address for when getting a chunk from the counter.

Storage of a free list big enough to store all physical block addresses.

Maintain three pointers for the free list, all reset to zero. The first one we will call write\_ptr. This pointer will identify the next location to write the physical address of a block to be de-allocated. Note: we can never free more physical memory locations than we have. Once an address has been recorded, the pointer will be incremented to walk the free list like a ring.

The second pointer will be called stop\_ptr. The stop\_ptr pointer will be advanced by the number of address chunks de-allocated when a context finishes. The addresses between the stop\_ptr and write\_ptr cannot be reused because they are still in use. But as soon as the context using them is dismissed, the stop\_ptr will be advanced.

The third pointer will be called read\_ptr. This pointer will point to the next address that can be used for allocation, as long as the read\_ptr does not equal the stop\_ptr and the IFC is at its maximum count.
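As a rough software model of the counter and the three pointers, the sketch below may help. The class and method names (`FreeList`, `allocate`, `record_free`, `retire_context`) are hypothetical, and the model assumes that read\_ptr == stop\_ptr always means "nothing reusable", so it does not handle a ring in which every entry is reusable at once.

```python
class FreeList:
    """Illustrative model of the IFC plus write/stop/read pointers."""

    def __init__(self, num_blocks: int):
        self.num_blocks = num_blocks
        self.ifc = 0                     # Initial Free Counter, resets to zero
        self.ring = [None] * num_blocks  # storage for freed block addresses
        self.write_ptr = 0               # next slot to record a freed block
        self.stop_ptr = 0                # end of the reusable region
        self.read_ptr = 0                # next reusable block

    def allocate(self):
        """Return a physical block address, or None if the caller must stall."""
        if self.ifc < self.num_blocks:
            addr = self.ifc              # the count is the physical address
            self.ifc += 1
            return addr
        if self.read_ptr != self.stop_ptr:
            addr = self.ring[self.read_ptr]
            self.read_ptr = (self.read_ptr + 1) % self.num_blocks
            return addr
        return None

    def record_free(self, addr: int) -> None:
        """Record a block to be de-allocated; it is not reusable yet."""
        self.ring[self.write_ptr] = addr
        self.write_ptr = (self.write_ptr + 1) % self.num_blocks

    def retire_context(self, freed_count: int) -> None:
        """A context finished: its recorded blocks become reusable."""
        self.stop_ptr = (self.stop_ptr + freed_count) % self.num_blocks
```

Blocks recorded with `record_free` sit between stop\_ptr and write\_ptr and only become visible to `allocate` once `retire_context` advances stop\_ptr, which mirrors the rule that addresses still in use cannot be handed out.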




### 5.3.5 De-allocate Block

This block will maintain a free physical address block count for each context. While in current context, a count shall be maintained specifying how many blocks were written into the free list at the write\_ptr pointer. This count will be reset upon reset or when this context is active on the back and different than the previous context. It is actually a count of blocks in the previous context that will no longer be used. This count will be used to advance the write\_ptr pointer to make available the set of physical blocks freed when the previous context was done. This allows the discard or de-allocation of any number of blocks in one clock.

### 5.3.6 Operation of Incremental model

The basic operation of the model would start with the write\_ptr, stop\_ptr and read\_ptr pointers in the free list set to zero and the free list counter set to zero. All the dirty bits and the previous context will also be initialized to zero. When the first set constants happen, the reset dirty bit will not be set, so we will allocate a physical location from the free list counter because it is not at the max value. The data will be written into physical address zero. Both the additional copy of the renaming table and the context zero entry of the big renaming table will be updated for the logical address that was written by the set, starting with physical address 0. This process will be repeated for any logical addresses that are not dirty until the context changes. If a logical address is hit that has its dirty bits set while in the same context, both dirty bits would be set, so the new data will overwrite the last physical address assigned for this logical address. When the first draw command of the context is detected, the previous context stored in the additional renaming table will be copied to the larger renaming table in the current (new) context location. Then the set constant logical address will be loaded with a new physical address during the copy and, if the reset dirty bit was set, the physical address it replaced in the renaming table would be entered at the write\_ptr pointer location on the free list and the write\_ptr will be incremented. The de-allocation counter for the previous context (eight) will be incremented. Then, as set states come in for this context, one of the following will happen:

- 1.) No dirty bits are set for the logical address being updated. A line will be allocated off the free-list counter, or off the free list at the read\_ptr pointer if read\_ptr != stop\_ptr.
- 2.) Reset dirty set and context dirty not set. A new physical address is allocated, the physical address in the renaming table is put on the free list at write\_ptr, and write\_ptr is incremented along with the de-allocate counter for the previous context.
- 3.) Context dirty is set. The data will be written into the physical address specified by the logical address.

This process will continue as long as set states arrive. This block will provide backpressure to the CP whenever it has no free list entries available (counter at max and stop\_ptr == read\_ptr). The command stream will keep a count of contexts of constants in use and prevent more than max constants contexts from being sent.

Whenever a draw packet arrives, the content of the re-mapping table is written to the correct re-mapping table for the context number. Also, if the next context uses fewer constants than the current one, all the excess lines are moved to the free list to be de-allocated later. This happens in parallel with the writing of the re-mapping table to the correct memory.

Preferably, when the constant context leaves the last ALU clause it will be sent to this block and compared with the previous context that left (initialized to zero). If they differ, then the older context will no longer be referenced and thus can be de-allocated in the physical memory. This is accomplished by adding the number of blocks freed by this context to the stop\_ptr pointer. This will make all the physical addresses used by this context available to the read\_ptr allocate pointer for future allocation.

This device allows representation of multiple contexts of constant data with N copies of the logical address space. It also allows the second context to be represented as the first set plus some new additional data by just storing the deltas. It allows memory to be used efficiently, and when the constant updates are small it can store multiple contexts. However, if the updates are large, fewer contexts will be stored and performance will potentially be degraded, although it will still perform as well as a ring could in this case.

# 5.4 Constant Store Indexing

In order to do constant store indexing, the sequencer must first be loaded with the indexes (that come from the GPRs). There are 144 wires from the exit of the SP to the sequencer (9-bit pointers x 16 vertices/clock). Since the data must pass through the Shader pipe for the float to fixed conversion, there is a latency of 4 clocks (1 instruction) between the time the sequencer is loaded and the time one can index into the constant store. The assembly will look like this:

MOVA R1.X,R2.X // Loads the sequencer with the content of R2.X, also copies the content of R2.X into R1.X

NOP // latency of the float to fixed conversion

ADD R3,R4,C0[R2.X] // Uses the state from the sequencer to add R4 to C0[R2.X] into R3

Note that we don't really care about what is in the brackets because we use the state from the MOVA instruction. R2.X is just written again for the sake of simplicity and coherency.

The storage needed in the sequencer in order to support this feature is 2\*64\*9 bits = 1152 bits.
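The wire count and the storage figure quoted in this section can be checked with a few lines of arithmetic. The constant names below are hypothetical; in particular, the text does not say what the factor of 2 stands for, so `COPIES` is only a placeholder for it.

```python
POINTER_BITS = 9         # width of one constant-store index
VERTICES_PER_CLOCK = 16  # indexes delivered by the SP per clock
VECTOR_SIZE = 64         # indexes held per vector
COPIES = 2               # assumption: the unexplained factor of 2 in the text

wires = POINTER_BITS * VERTICES_PER_CLOCK
storage_bits = COPIES * VECTOR_SIZE * POINTER_BITS

print(wires, storage_bits)
```

Running this prints `144 1152`, matching the 144 wires and the 1152 bits stated above.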

### 5.5 Real Time Commands

The real time command constants are written by the CP using the register mapped registers allocated for RT. It works the same way as regular constant loads, BUT in this case the CP is not sending a logical address but rather a physical address, and the reads do not pass through the re-mapping table but are read directly from the memory. The boundary between the two zones is defined by the CONST\_EO\_RT control register. Similarly, for the fetch state, the boundary between the two zones is defined by the TSTATE\_EO\_RT control register.

### 5.6 Constant Waterfalling

In order to have reasonable performance in the case of constant store indexing using the address register, we are going to have the possibility of using the physical memory port for read only. This way we can read 1 constant per clock and thus have a worst-case waterfall mode of 1 vertex per clock. There is a small synchronization issue related to this, as the SQ needs to make sure that the constants were actually written to memory (not only sent to the sequencer) before it can allow the first vector of pixels or vertices of the state to go through the ALUs. To do so, the sequencer keeps 8 bits (one per render state) and sets the bit whenever the last render state is written to memory and clears the bit whenever a state is freed.



Figure 9: The instruction store




# 6. Looping and Branches

Loops and branches are planned to be supported and will have to be dealt with at the sequencer level. We plan on supporting constant loops and branches using a control program.

# 6.1 The controlling state.

The R400 controlling state consists of:

Boolean[255:0]

Loop\_count[7:0][31:0]

Loop\_Start[7:0][31:0]

Loop\_Step[7:0][31:0]

That is 256 Booleans and 32 loops.

We have a stack of 4 elements for nested calls of subroutines and 4 loop counters to allow for nested loops.

This state is available on a per shader program basis.

# 6.2 The Control Flow Program

We'd like to be able to code up a program of the form:

1: Loop
2: Exec TexFetch
3: TexFetch
4: ALU
5: ALU
6: TexFetch
7: End Loop
8: ALU Export

But realize that 3: may be dependent on 2: and 4: is almost certainly dependent on 2: and 3:. Without clausing, these dependencies need to be expressed in the Control Flow instructions. Additionally, without separate 'texture clauses' and 'ALU clauses' we need to know which instructions to dispatch to the Texture Unit and which to the ALU unit. This information will be encapsulated in the flow control instructions.

Each control flow instruction will contain 2 bits of information for each (non-control flow) instruction:

- a) ALU or Texture
- b) Serialize Execution

(b) would force the thread to stop execution at this point (before the instruction is executed) and wait until all textures have been fetched. Given the allocation of reserved bits, this would mean that the count of an 'Exec' instruction would be limited to about 8 (non-control-flow) instructions. If more than this were needed, a second Exec (with the same conditions) would be issued.

Another function that relies upon 'clauses' is allocation and order of execution. We need to assure that pixels and vertices are exported in the correct order (even if not all execution is ordered) and that space in the output buffers is allocated in order. Additionally data can't be exported until space is allocated. A new control flow instruction:

Alloc <buffer select -- position, parameter, pixel or vertex memory, and the size required>.

would be created to mark where such allocation needs to be done. To assure allocation is done in order, the actual allocation for a given thread cannot be performed unless the equivalent allocation for all previous threads is already completed. The implementation would also assure that execution of instruction(s) following the serialization due to the Alloc will occur in order -- at least until the next serialization or change from ALU to Texture. In most cases this will allow the exports to occur without any further synchronization. Only 'final' allocations or position allocations are guaranteed to be ordered. Because strict ordering is required for pixels, parameters and positions, this implies only a single alloc for these structures. Vertex exports to memory do not require ordering during allocation and so multiple 'allocs' may be done.

#### 6.2.1 Control flow instructions table

Here is the revised control flow instruction set.

Note that whenever a field is marked as RESERVED, it is assumed that all the bits of the field are cleared (0).

| Execute    |       |          |                                                |       |              |
|------------|-------|----------|------------------------------------------------|-------|--------------|
| 47         | 46 43 | 40 34    | 33 16                                          | 15 12 | 11 0         |
| Addressing | 0001  | RESERVED | Instructions type + serialize (9 instructions) | Count | Exec Address |

Execute up to 9 instructions at the specified address in the instruction memory. The Instruction type field tells the sequencer the type of each instruction (LSB: 1 = Texture, 0 = ALU) and whether or not to serialize its execution (MSB: 1 = Serialize, 0 = Non-Serialized).
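To illustrate how the Execute fields could be unpacked, here is a minimal sketch. The function name `decode_execute` is hypothetical, and two points are assumptions not stated in the text: that instruction i uses bits [2i+1:2i] of the 18-bit type field, and that Count holds the instruction count directly.

```python
def decode_execute(word: int):
    """Split a 48-bit Execute control-flow word into its fields."""
    exec_address = word & 0xFFF          # bits 11..0
    count = (word >> 12) & 0xF           # bits 15..12
    type_field = (word >> 16) & 0x3FFFF  # bits 33..16, 2 bits per instruction
    opcode = (word >> 43) & 0xF          # bits 46..43

    instructions = []
    for i in range(min(count, 9)):
        pair = (type_field >> (2 * i)) & 0x3  # assumed packing order
        instructions.append({
            "unit": "texture" if pair & 1 else "alu",  # LSB: 1 = Texture
            "serialize": bool(pair >> 1),              # MSB: 1 = Serialize
        })
    return opcode, exec_address, instructions


# Two instructions: a texture fetch, then a serialized ALU instruction.
word = (0b0001 << 43) | (0b1001 << 16) | (2 << 12) | 0x123
opcode, address, instrs = decode_execute(word)
```

With this packing, the 18 bits of the type field cover exactly 9 instructions at 2 bits each, which is consistent with the "9 instructions" limit in the table.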

| NOP        |       |          |   |
|------------|-------|----------|---|
| 47         | 46 43 | 42 0     |   |
| Addressing | 0010  | RESERVED | 1 |

This is a regular NOP.

| Conditional_Execute |       |    |                 |                                                |       |              |
|---------------------|-------|----|-----------------|------------------------------------------------|-------|--------------|
| 47                  | 46 43 | 42 | 41 34           | 33 16                                          | 15 12 | 11 0         |
| Addressing          |       |    | Boolean address | Instructions type + serialize (9 instructions) | Count | Exec Address |

If the specified Boolean (8 bits can address 256 Booleans) meets the specified condition then execute the specified instructions (up to 9 instructions). If the condition is not met, we go on to the next control flow instruction.

| Conditional_Execute_Predicates |       |           |          |                  |                                                |       |              |
|--------------------------------|-------|-----------|----------|------------------|------------------------------------------------|-------|--------------|
| 47                             | 46 43 | 42        | 41 36    | 35 34            | 33 16                                          | 15 12 | 11 0         |
| Addressing                     | 0010  | Condition | RESERVED | Predicate vector | Instructions type + serialize (9 instructions) | Count | Exec Address |

Check the AND/OR of all current predicate bits. If AND/OR matches the condition execute the specified number of instructions. We need to AND/OR this with the kill mask in order not to consider the pixels that aren't valid. If the condition is not met, we go on to the next control flow instruction.

| Loop_Start |       |          |         |          |              |
|------------|-------|----------|---------|----------|--------------|
| 47         | 46 43 | 42 17    | 20 16   | 15 12    | 11 0         |
| Addressing | 0101  | RESERVED | loop ID | RESERVED | Jump address |

Loop Start. Compares the loop iterator with the end value. If the loop condition is not met, jump to the address. Forward jump only. Also computes the index value. The loop ID must match between the start and the end of the loop, and also indicates which control flow constants should be used with the loop.




| Loop_End   |       |          |                 |         |          |               |
|------------|-------|----------|-----------------|---------|----------|---------------|
| 47         | 46 43 | 42 24    | 23 21           | 20 16   | 15 12    | 11 0          |
| Addressing | 0011  | RESERVED | Predicate break | loop ID | RESERVED | start address |

Loop end. Increments the counter by one and compares the loop count with the end value. If the loop condition is met, continue; else, jump BACK to the start of the loop. If predicate break != 0, then compares predicate vector n (specified by the predicate break number). If all bits are cleared then break the loop.

The way this is described does not prevent nested loops, and the inclusion of the loop ID makes this easy to do.

| Conditional_Call |       |           |                 |          |            |              |
|------------------|-------|-----------|-----------------|----------|------------|--------------|
| 47               | 46 43 | 42        | 41 34           | 33 13    | 12         | 11 0         |
| Addressing       | 0111  | Condition | Boolean address | RESERVED | Force Call | Jump address |

If the condition is met, jumps to the specified address and pushes the control flow program counter on the stack. If force call is set the condition is ignored and the call is made always.

| Return     |       |          |   |  |  |
|------------|-------|----------|---|--|--|
| 47         | 46 43 | 42 0     |   |  |  |
| Addressing | 1000  | RESERVED | 8 |  |  |

Pops the topmost address from the stack and jumps to that address. If nothing is on the stack, the program will just continue to the next instruction.
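The call and return behaviour described above can be modelled with a small bounded stack. The class `ControlFlowStack` is hypothetical. Two choices are assumptions: the pushed value is taken to be the address of the instruction after the call, and a call on a full stack raises an exception (the specification only lists it as an error condition).

```python
class ControlFlowStack:
    DEPTH = 4  # the text gives a stack of 4 elements for nested calls

    def __init__(self):
        self.entries = []

    def call(self, pc: int, target: int) -> int:
        """Push the return address and return the new program counter."""
        if len(self.entries) >= self.DEPTH:
            raise OverflowError("call with stack full")  # assumed handling
        self.entries.append(pc + 1)  # assumption: resume after the call
        return target

    def ret(self, pc: int) -> int:
        """Pop and jump; with an empty stack, fall through to the next instruction."""
        if not self.entries:
            return pc + 1
        return self.entries.pop()
```

A Return on an empty stack simply yields `pc + 1`, matching the statement that the program continues to the next instruction.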

| Conditional_Jump |       |           |                 |         |          |            |              |
|------------------|-------|-----------|-----------------|---------|----------|------------|--------------|
| 47               | 46 43 | 42        | 41 34           | 33      | 32 13    | 12         | 11 0         |
| Addressing       | 1001  | Condition | Boolean address | FW only | RESERVED | Force Jump | Jump address |

If force jump is set the condition is ignored and the jump is made always. If FW only is set then only forward jumps are allowed.

| Allocate |       |               |          |                 |
|----------|-------|---------------|----------|-----------------|
| 47       | 46 43 | 42 41         | 40 4     | 3 0             |
| Debug    | 1010  | Buffer Select | RESERVED | Allocation size |

Buffer Select takes a value of the following:

01 - position export (ordered export)

10 - parameter cache or pixel export (ordered export)

11 - pass thru (out of order exports).

If debug is set this is a debug alloc (ignore if debug DB\_ON register is set to off).

| End Of Program |       |          |
|----------------|-------|----------|
| 47             | 46 43 | 42 0     |
| RESERVED       | 1011  | RESERVED |

Marks the end of the program.

# 6.3 Implementation




The envisioned implementation has a buffer that maintains the state of each thread. A thread lives in a given location in the buffer during its entire life, but the buffer has FIFO qualities in that threads leave in the order that they enter. Actually two buffers are maintained — one for Vertices and one for Pixels. The intended implementation would allow for:

16 entries for vertices

48 entries for pixels.

From each buffer, arbitration logic attempts to select 1 thread for the texture unit and 1 (interleaved) thread for the ALU unit. Once a thread is selected it is read out of the buffer, marked as invalid, and submitted to the appropriate execution unit. It is returned to the buffer (at the same place) with its status updated once all possible sequential instructions have been executed. A switch from ALU to TEX or vice versa, or a Serialize\_Execution modifier, forces the thread to be returned to the buffer.

Each entry in the buffer will be stored across two physical pieces of memory - most bits will be stored in a 1 read port device. Only bits needed for thread arbitration will be stored in a highly multi-ported structure. The bits kept in the 1 read port device will be termed 'state'. The bits kept in the multi-read ported device will be termed 'status'.

#### 'State Bits' needed include:

- 1. Control Flow Instruction Pointer (12\_13 bits),
- 2. Execution Count Marker (4 bits),
- 3. Loop Iterators (4x9 bits),
- 4. Call return pointers (4x12 bits),
- 5. Predicate Bits (4x64 bits),
- 6. Export ID (1 bit),
- 7. Parameter Cache base Ptr (7 bits),
- 8. GPR Base Ptr (8 bits),
- 9. Context Ptr (3 bits).
- 10. LOD corrections (6x16 bits)
- 11. Valid bits (64 bits)

Absent from this list are 'Index' pointers. These are costly enough that I'm presuming that they are instead stored in the GPRs. The first seven fields above (Control Flow Ptr, Execution Count, Loop Counts, call return ptrs, Predicate bits, PC base ptr and export ID) are updated every time the thread is returned to the buffer, based on how much progress has been made on thread execution. GPR Base Ptr, Context Ptr and LOD corrections are unchanged throughout execution of the thread.

#### 'Status Bits' needed include:

- Valid Thread
- Texture/ALU engine needed
- Texture Reads are outstanding
- Waiting on Texture Read to Complete
- Allocation Wait (2 bits)
  - 00 No allocation needed
  - 01 Position export allocation needed (ordered export)
  - 10 Parameter or pixel export needed (ordered export)
  - 11 pass thru (out of order export)
- Allocation Size (4 bits)
- Position Allocated
- First thread of a new context
- Event thread (NULL thread that needs to trickle down the pipe)
- Last (1 bit)
- Pulse SX (1 bit)


All of the above fields from all of the entries go into the arbitration circuitry. The arbitration circuitry will select a winner for both the Texture Engine and for the ALU engine. There are actually two sets of arbitration -- one for pixels and one for vertices. A final selection is then done between the two. But the rest of this implementation summary only considers the 'first' level selection which is similar for both pixels and vertices.

Texture arbitration requires no allocation or ordering so it is purely based on selecting the 'oldest' thread that requires the Texture Engine.

ALU arbitration is a little more complicated. First, only threads where either of Texture\_Reads\_outstanding or Waiting\_on\_Texture\_Read\_to\_Complete are '0' are considered. Then if Allocation\_Wait is active, these threads are further filtered based on whether space is available. If the allocation is position allocation, then the thread is only considered if all 'older' threads have already done their position allocation (position allocated bits set). If the allocation is parameter or pixel allocation, then the thread is only considered if it is the oldest thread. Also a thread is not considered if it is a parameter or pixel or position allocation, has its First\_thread\_of\_a\_new\_context bit set and would cause ALU interleaving with another thread performing the same parameter or pixel or position allocation. Finally the 'oldest' of the threads that pass through the above filters is selected. If the thread needed to allocate, then at this time the allocation is done, based on Allocation\_Size. If a thread has its "last" bit set, then it is also removed from the buffer, never to return.
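The filtering steps of ALU arbitration can be sketched as follows. All names here (`ThreadStatus`, `select_alu_thread`, `space_available`) are hypothetical. The sketch reads the texture rule literally (a thread is skipped only when both texture bits are set), treats the thread list as every buffer entry in oldest-first order, and leaves out the First\_thread\_of\_a\_new\_context interleaving rule for brevity.

```python
from dataclasses import dataclass

NO_ALLOC, POSITION, PARAM_OR_PIXEL, PASS_THRU = 0, 1, 2, 3


@dataclass
class ThreadStatus:
    valid: bool = True
    needs_alu: bool = True
    texture_reads_outstanding: bool = False
    waiting_on_texture_read: bool = False
    allocation_wait: int = NO_ALLOC
    allocation_size: int = 0
    position_allocated: bool = False


def select_alu_thread(threads, space_available):
    """Return the index of the oldest eligible thread, or None.

    threads: all buffer entries, oldest first.
    space_available: hypothetical callback, buffer select -> free space.
    """
    for i, t in enumerate(threads):
        if not (t.valid and t.needs_alu):
            continue
        if t.texture_reads_outstanding and t.waiting_on_texture_read:
            continue
        if t.allocation_wait != NO_ALLOC:
            if t.allocation_size > space_available(t.allocation_wait):
                continue
            if t.allocation_wait == POSITION and not all(
                    older.position_allocated for older in threads[:i]):
                continue
            if t.allocation_wait == PARAM_OR_PIXEL and i != 0:
                continue
        return i  # first survivor in oldest-first order is the winner
    return None
```

Because the list is scanned oldest first, the first thread that passes every filter is also the oldest one that passes, so no separate age comparison is needed in this model.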

If I now redefine 'clauses' to mean 'how many times the thread is removed from the thread buffer for the purpose of execution by either the ALU or Texture engine', then the minimum number of clauses needed is 2 -- one to perform the allocation for exports (execution automatically halts after an 'Alloc' instruction) (but doesn't perform the actual allocation) and one for the actual ALU/export instructions. As the 'Alloc' instruction could be part of a texture clause (presumably the final instruction in such a clause), a thread could still execute in this minimal number of 2 clauses, even if it involved texture fetching.

The Texture\_Reads\_Outstanding bit must be updated by the sequencer, based on keeping track of how many Texture Clauses have been executed by a given thread that have not yet had their data returned. Any number above 0 results in this bit being set. We could consider forcing synchronization such that two texture clauses for a given thread may not be outstanding at any time (that would be my preference for simplicity reasons and because it would require only very little change in the texture pipe interface). This would allow the sequencer to set the bit on execution of the texture clause, and allow the texture unit to return a pointer to the thread buffer on completion that clears the bit.

# 6.4 Data dependent predicate instructions

Data dependent conditionals will be supported in the R400. The only way we plan to support those is by supporting four vector/scalar predicate operations of the form:

PRED\_SETE\_# - similar to SETE except that the result is 'exported' to the sequencer.

PRED\_SETNE\_# - similar to SETNE except that the result is 'exported' to the sequencer.

PRED\_SETGT\_# - similar to SETGT except that the result is 'exported' to the sequencer

PRED\_SETGTE\_# - similar to SETGTE except that the result is 'exported' to the sequencer

For the scalar operations only we will also support the two following instructions:

PRED\_SETE0\_# - SETE0

PRED\_SETE1\_# - SETE1

The export is a single bit - 1 or 0 that is sent using the same data path as the MOVA instruction. The sequencer will maintain 4 sets of 64 bit predicate vectors (in fact 8 sets because we interleave two programs but only 4 will be exposed) and use it to control the write masking. This predicate is not maintained across clause boundaries. The # sign is used to specify which predicate set you want to use 0 thru 3.

Then we have two conditional execute bits. The first bit is a conditional execute "on" bit and the second bit tells us if we execute on 1 or 0. For example, the instruction:

P0\_ADD\_# R0,R1,R2


is only going to write the result of the ADD into those GPRs whose predicate bit is 0. Alternatively, P1\_ADD\_# would only write the results to the GPRs whose predicate bit is set. The use of P0 or P1 without precharging the sequencer with a PRED instruction is undefined.
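The effect of the two conditional execute bits on write masking can be shown with a short sketch. The function `predicated_write` is hypothetical, and the example uses 4 lanes instead of the 64 of a real predicate vector to keep it readable.

```python
def predicated_write(dest, results, predicate, execute_on):
    """Write results[i] into lane i only where predicate[i] == execute_on.

    execute_on = 0 models a P0_ prefixed instruction, 1 models P1_.
    """
    return [r if p == execute_on else d
            for d, r, p in zip(dest, results, predicate)]


old = [0, 0, 0, 0]          # current GPR contents per lane
add_result = [5, 6, 7, 8]   # what the ADD computed per lane
pred = [0, 1, 0, 1]         # predicate bits held by the sequencer

p0_result = predicated_write(old, add_result, pred, 0)  # [5, 0, 7, 0]
p1_result = predicated_write(old, add_result, pred, 1)  # [0, 6, 0, 8]
```

Lanes whose predicate bit does not match keep their previous value, which is the write-masking behaviour described above.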

{Issue: do we have to have a NOP between PRED and the first instruction that uses a predicate?}

# 6.5 HW Detection of PV,PS

Because of the control program, the compiler cannot detect statically dependent instructions. In the case of non-masked writes and subsequent reads the sequencer will insert uses of PV,PS as needed. This will be done by comparing the read address and the write address of consecutive instructions. For masked writes, the sequencer will insert NOPs wherever there is a dependent read/write.

The sequencer will also have to insert NOPs between PRED\_SET and MOVA instructions and their uses.

### 6.6 Register file indexing

Because we can have loops in a fetch clause, we need to be able to index into the register file in order to retrieve the data created in a fetch clause loop and use it in an ALU clause. The instruction will include the base address for register indexing and the instruction will contain these controls:

| Bit 7 | Bit 6 |                     |
|-------|-------|---------------------|
| 0     | 0     | 'absolute register' |
| 0     | 1     | 'relative register' |
| 1     | 0     | 'previous vector'   |
| 1     | 1     | 'previous scalar'   |

In the case of an absolute register we just take the address as is. In the case of a relative register read we take the base address and we add to it the loop\_index and this becomes our new address that we give to the shader pipe.

The sequencer is going to keep a loop index computed as such:

Index = Loop\_iterator\*Loop\_step + Loop\_start.

We loop until loop\_iterator = loop\_count. Loop\_step is a signed value [-128...127]. The computed index value is a 10 bit counter that is also signed. Its real range is [-256,256]. The tenth bit is only there so that we can provide an out of range value to the "indexing logic" so that it knows when the provided index is out of range and thus can make the necessary arrangements.
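A minimal sketch of the index computation and of relative register addressing follows. The function names are hypothetical, and the out-of-range test simply applies the [-256,256] range quoted above; how the hardware encodes the out-of-range condition in the tenth bit is not modelled.

```python
def loop_index(iterator: int, step: int, start: int):
    """Return (index, in_range) for Index = Loop_iterator*Loop_step + Loop_start."""
    if not -128 <= step <= 127:
        raise ValueError("Loop_step is a signed value in [-128, 127]")
    index = iterator * step + start
    in_range = -256 <= index <= 256  # range stated in the text
    return index, in_range


def register_address(base: int, relative: bool, index: int) -> int:
    """Absolute: use the address as is. Relative: add the loop index."""
    return base + index if relative else base


index, ok = loop_index(iterator=3, step=2, start=1)   # (7, True)
address = register_address(base=16, relative=True, index=index)  # 23
```

An index that falls outside the range is still returned, together with `in_range = False`, so that the caller can decide how to handle it.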

# 6.7 Debugging the Shaders

In order to be able to debug the pixel/vertex shaders efficiently, we provide 2 methods.

### 6.7.1 Method 1: Debugging registers

Current plans are to expose 2 debugging, or error notification, registers:

- 1. address register where the first error occurred
- 2. count of the number of errors

The sequencer will detect the following groups of errors:

- count overflow
- constant indexing overflow
- register indexing overflow

Compiler recognizable errors:

- jump errors
  - relative jump address > size of the control flow program
- call stack
  - call with stack full
  - return with stack empty




A jump error will always cause the program to break. In this case, a break means that a clause will halt execution, while allowing further clauses to be executed.

With all the other errors, the program can continue to run, potentially to worst-case limits. The program will only break if the DB\_PROB\_BREAK register is set.

If indexing outside of the constant or the register range, causing an overflow error, the hardware is specified to return the value with an index of 0. This could be exploited to generate error tokens, by reserving and initializing the 0th register (or constant) for errors.
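The error-token idea can be illustrated with a small sketch. The function `read_indexed` and the `"ERR"` marker are hypothetical; the only behaviour taken from the text is that an out-of-range index returns the value stored at index 0.

```python
def read_indexed(store, index):
    """Return (value, overflow) for an indexed read of a register or constant store."""
    if 0 <= index < len(store):
        return store[index], False
    # Out of range: hardware is specified to return the value at index 0.
    return store[0], True


# Entry 0 is reserved and initialized with a recognizable error token.
constants = ["ERR", 1.0, 2.0]

good = read_indexed(constants, 2)  # (2.0, False)
bad = read_indexed(constants, 7)   # ("ERR", True)
```

With entry 0 reserved this way, a shader result that contains the token is a visible sign that an indexing overflow occurred somewhere upstream.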

{ISSUE : Interrupt to the driver or not?}

### 6.7.2 Method 2: Exporting the values in the GPRs

1) The sequencer will have a debug active, count register and an address register for this mode.

Under the normal mode execution follows the normal course.

Under the debug mode it is assumed that the program is always exporting n debug vectors and that all other exports to the SX block (position, color, z, etc.) will be turned off (changed into NOPs) by the sequencer (even if they occur before the address stated by the ADDR debug register).

### 7. Pixel Kill Mask

A vector of 64 bits is kept by the sequencer per group of pixels/vertices. Its purpose is to optimize the texture fetch requests and allow the shader pipe to kill pixels using the following instructions:

MASK\_SETE

MASK\_SETNE

MASK\_SETGT

MASK\_SETGTE

# 8. Multipass vertex shaders (HOS)

Multipass vertex shaders are able to export from the 6 last clauses but to memory ONLY.

### 9. Register file allocation

The register file allocation for vertices and pixels can either be static or dynamic. In both cases, the register file is managed using two round robins (one for pixels and one for vertices). In the dynamic case the boundary between pixels and vertices is allowed to move; in the static case it is fixed to 128-VERTEX\_REG\_SIZE for vertices and PIXEL\_REG\_SIZE for pixels.



Above is an example of how the algorithm works. Vertices come in from top to bottom; pixels come in from bottom to top. Vertices are in orange and pixels in green. The blue line is the tail of the vertices and the green line is the tail of the pixels. Thus anything between the two lines is shared. When pixels meet vertices the line turns white and the boundary is static until both vertices and pixels share the same "unallocated bubble". Then the boundary is allowed to move again. The numbering of the GPRs starts from the bottom of the picture at index 0 and goes up to the top at index 127.

# 10. Fetch Arbitration

The fetch arbitration logic chooses one of the 8 potentially pending fetch clauses to be executed. The choice is made by looking at the fifos from 7 to 0 and picking the first one ready to execute. Once chosen, the clause state machine will send one 2x2 fetch per clock (or 4 fetches in one clock every 4 clocks) until all the fetch instructions of the clause are sent. This means that there cannot be any dependencies between two fetches of the same clause.
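The fixed-priority choice described above amounts to a scan from FIFO 7 down to FIFO 0. The function name `pick_clause` and the `ready` list are hypothetical; they only illustrate the selection order.

```python
def pick_clause(ready):
    """Return the highest-numbered ready FIFO (7 first), or None if none is ready.

    ready: list of 8 booleans, ready[i] is True when FIFO i can execute.
    """
    for fifo in range(7, -1, -1):
        if ready[fifo]:
            return fifo
    return None


ready = [True, False, False, True, False, False, False, False]
chosen = pick_clause(ready)  # FIFO 3 wins over FIFO 0
```

Scanning from 7 to 0 gives priority to the higher-numbered FIFO whenever more than one is ready.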

The arbitrator will not wait for the fetches to return prior to selecting another clause for execution. The fetch pipe will be able to handle up to X(?) in flight fetches and thus there can be a fair number of active clauses waiting for their fetch return data.

# 11. ALU Arbitration

ALU arbitration proceeds in almost the same way as fetch arbitration. The ALU arbitration logic chooses one of the 8 potentially pending ALU clauses to be executed. The choice is made by looking at the fifos from 7 to 0 and picking the first one ready to execute. There are two ALU arbiters, one for the even clocks and one for the odd clocks. For example, here is the sequencing of two interleaved ALU clauses (E and O stand for Even and Odd sets of 4 clocks):

Einst0 Oinst0 Einst1 Oinst1 Einst2 Oinst2 Einst0 Oinst3 Einst1 Oinst4 Einst2 Oinst0...

Proceeding this way hides the latency of 8 clocks of the ALUs. Note that the interleaving also occurs across clause boundaries.
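The example sequence above can be reproduced with a short sketch. The clause lengths (3 instructions per clause on the even side, 5 on the odd side) are inferred from the example itself, and the function names are illustrative only.

```python
from itertools import islice

def clause_stream(prefix, clause_lengths):
    """Yield instruction labels for back-to-back clauses of one arbiter."""
    for length in clause_lengths:
        for inst in range(length):
            yield f"{prefix}inst{inst}"

def interleave(even, odd):
    """Alternate issue slots between the even and the odd arbiter."""
    for e, o in zip(even, odd):
        yield e
        yield o

# Assumed workload: two 3-instruction clauses (even), two 5-instruction clauses (odd).
sequence = list(islice(interleave(clause_stream("E", [3, 3]),
                                  clause_stream("O", [5, 5])), 12))
print(" ".join(sequence))
```

The even stream restarts at Einst0 after three instructions while the odd stream is still in the middle of its clause, which is the interleaving across clause boundaries mentioned above.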




# 12. Handling Stalls

When the output file is full, the sequencer prevents the ALU arbitration logic from selecting the last clause (this way nothing can exit the shader pipe until there is room in the output file). If the packet is a vertex packet and the position buffer is full (POS FULL) then the sequencer also prevents a thread from entering the exporting clause (3?). The sequencer will set the OUT FILE FULL signal n clocks before the output file is actually full, so the ALU arbiter will be able to read this signal and act accordingly by not allowing exporting clauses to proceed.

# 13. Content of the reservation station FIFOs

The reservation FIFOs contain the state of the vector of pixels and vertices. We have two sets of those: one for pixels, and one for vertices. They contain 3 bits of Render State, 7 bits for the base address of the GPRs, some bits for LOD correction, coverage mask information (in order to fetch for only valid pixels), and the quad address.

### 14. The Output File

The output file is where pixels are put before they go to the RBs. The write BW to this store is 256 bits/clock. Just before this output file are staging registers with write BW 512 bits/clock and read BW 256 bits/clock. The staging registers are 4x128 (and there are 16 of those on the whole chip).

#### 15. IJ Format

The IJ information sent by the PA is of this format on a per quad basis:

We have a vector of IJ's (one IJ per pixel at the centroid of the fragment or at the center of the pixel depending on the mode bit). The interpolation is done at a different precision across the 2x2. The upper left pixel's parameters are always interpolated at full 20x24 mantissa precision. Then the result of the interpolation along with the difference in IJ in reduced precision is used to interpolate the parameter for the other three pixels of the 2x2. Here is how we do it:

Assuming P0 is the interpolated parameter at Pixel 0 having the barycentric coordinates I(0), J(0) and so on for P1, P2 and P3. Also assuming that A is the parameter value at V0 (interpolated with I), B is the parameter value at V1 (interpolated with J) and C is the parameter value at V2 (interpolated with (1-I-J)).

$$\Delta 01I = I(1) - I(0)$$

$$\Delta 01J = J(1) - J(0)$$

$$\Delta 02I = I(2) - I(0)$$

$$\Delta 02J = J(2) - J(0)$$

$$\Delta 03I = I(3) - I(0)$$

$$\Delta 03J = J(3) - J(0)$$

$$P0 = C + I(0)*(A-C) + J(0)*(B-C)$$

$$P1 = P0 + \Delta 01I * (A - C) + \Delta 01J * (B - C)$$

$$P2 = P0 + \Delta 02I * (A - C) + \Delta 02J * (B - C)$$

$$P3 = P0 + \Delta 03I * (A - C) + \Delta 03J * (B - C)$$

P0 is computed at 20x24 mantissa precision and P1 to P3 are computed at 8X24 mantissa precision. So far no visual degradation of the image was seen using this scheme.
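The equations above can be written out as a small sketch. It uses ordinary Python floats throughout, so it does not model the 20x24 and 8x24 mantissa precisions; the function name and argument layout are assumptions made for illustration.

```python
def interpolate_quad(a, b, c, ij):
    """Interpolate one parameter over a 2x2 quad.

    a, b, c: parameter values at V0, V1 and V2.
    ij: four (I, J) pairs, pixel 0 (upper left) first.
    Returns [P0, P1, P2, P3].
    """
    i0, j0 = ij[0]
    da, db = a - c, b - c                 # the two parameter subtracts
    p0 = c + i0 * da + j0 * db            # full precision in hardware
    result = [p0]
    for i, j in ij[1:]:
        delta_i, delta_j = i - i0, j - j0  # reduced precision in hardware
        result.append(p0 + delta_i * da + delta_j * db)
    return result

quad = interpolate_quad(1.0, 2.0, 3.0,
                        [(0.25, 0.25), (0.5, 0.25), (0.25, 0.5), (0.5, 0.5)])
```

Each of P1 to P3 reuses P0 and the shared differences (A - C) and (B - C), so only the two delta multiplies per pixel are new work.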

Multiplies (Full Precision): 2

Multiplies (Reduced precision): 6

Subtracts 19x24 (Parameters): 2

Adds: 8

FORMAT OF P0's IJ:

Mantissa 20 Exp 4 for I + Sign

Mantissa 20 Exp 4 for J + Sign

FORMAT of Deltas (x3):

Mantissa 8 Exp 4 for I + Sign

Mantissa 8 Exp 4 for J + Sign

Total number of bits: $20*2 + 8*6 + 4*8 + 4*2 = 128$

All numbers are kept using the un-normalized floating point convention: if the exponent is different from 0 the number is normalized; otherwise the number is un-normalized. The maximum range for the IJs (Full precision) is +/- 63 and the range for the Deltas is +/- 127.

# 15.1 Interpolation of constant attributes

Because of the floating point imprecision, we need to take special provisions if all the interpolated terms are the same or if two of the barycentric coordinates are the same.

We start with the premise that if A = B and B = C and C = A, then P0,1,2,3 = A. Since one or more of the IJ terms may be zero, we extend this to:

```
if (A == B and B == C and C == A) {
    P0,1,2,3 = A;
} else if (((I == 0) or (J == 0)) and
           ((J == 0) or (1-I-J == 0)) and
           ((1-I-J == 0) or (I == 0))) {
    // at least two of the three barycentric terms are zero
    if (I != 0) {
        P0 = A;
    } else if (J != 0) {
        P0 = B;
    } else {
        P0 = C;
    }
    // rest of the quad interpolated normally
} else {
    normal interpolation
}
```

# 16. Staging Registers

In order for the reuse of the vertices to be 14, the sequencer will have to re-order the data sent IN ORDER by the VGT for it to be aligned with the parameter cache memory arrangement. Given the following group of vertices sent by the VGT:

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 || 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 || 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 || 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63

The sequencer will re-arrange them in this fashion:

0 1 2 3 16 17 18 19 32 33 34 35 48 49 50 51 || 4 5 6 7 20 21 22 23 36 37 38 39 52 53 54 55 || 8 9 10 11 24 25 26 27 40 41 42 43 56 57 58 59 || 12 13 14 15 28 29 30 31 44 45 46 47 60 61 62 63

The || markers show the SP divisions. In the event a shader pipe is broken, the VGT will send padding to account for the missing pipe. For example, if SP1 is broken, vertices 4 5 6 7 20 21 22 23 36 37 38 39 52 53 54 55 will still be sent by the VGT to the SQ **BUT** will not be processed by the SP and thus should be considered invalid (by the SU and VGT).
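The re-arrangement shown above amounts to giving each SP a slice of 4 vertices from every group of 16 sent by the VGT. A minimal sketch of that index mapping follows; the function name is illustrative and the sketch ignores the broken-pipe padding case.

```python
def reorder_for_parameter_cache(vertices):
    """Regroup 64 in-order vertices so that each SP receives 4 vertices
    from every group of 16 sent by the VGT."""
    assert len(vertices) == 64
    return [vertices[group * 16 + sp * 4 + k]
            for sp in range(4)      # destination shader pipe
            for group in range(4)   # source group of 16 vertices
            for k in range(4)]      # vertex within the 4-wide slice

reordered = reorder_for_parameter_cache(list(range(64)))
print(reordered[:16])
```

The first 16 entries of `reordered` are the vertices routed to SP0, the next 16 those routed to SP1, and so on.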




The most straightforward, *non-compressed* interface method would be to convert, in the VGT, the data to 32-bit floating point prior to transmission to the VSISRs. In this scenario, the data would be transmitted to (and stored in) the VSISRs in full 32-bit floating point. This method requires three 24-bit fixed-to-float converters in the VGT. Unfortunately, it also requires an additional 3,072 bits of storage across the VSISRs. This interface is illustrated in Figure 11. The area of the fixed-to-float converters and the VSISRs for this method is roughly estimated as 0.759sqmm using the R300 process. The gate count estimate is shown in Figure 10.

| Basis for 8-deep Latch Memory (from R300) | Area  | Unit | Note                |
|-------------------------------------------|-------|------|---------------------|
| 8x24-bit                                  | 11631 | µ²   | 60.57813 µ² per bit |
| Area of 96x8-deep Latch Memory            | 46524 | µ²   |                     |
| Area of 24-bit Fix-to-float Converter     | 4712  | µ²   | per converter       |

| Method 1 | Block      | Quantity | Area (µ²) |
|----------|------------|----------|-----------|
|          | F2F        | 3        | 14136     |
|          | 8x96 Latch | 16       | 744384    |
|          | Total      |          | 758520    |

Figure 10: Area Estimate for VGT to Shader Interface






Figure 11: VGT to Shader Interface

# 17. The parameter cache

The parameter cache is where the vertex shaders export their data. It consists of 16 128x128 memories (1R/1W). The reuse engine will make it so that all vertices of a given primitive will hit different memories. The allocation method for these memories is a simple round robin. The parameter cache pointers are mapped in the following way: the 4 MSBs are the memory number and the 7 LSBs are the address within this memory.

| MEMORY NUMBER | ADDRESS |
|---------------|---------|
| 4 bits        | 7 bits  |

The PA generates the parameter cache addresses as the positions come from the SQ. All it needs to do is keep a Current\_Location pointer (7 bits only) and, as the positions come, increment the memory number. When the memory number field wraps around, the PA increments the Current\_Location by VS\_EXPORT\_COUNT (a snooped register from the SQ). As an example, say the memories are all empty to begin with and the vertex shader is exporting 8 parameters per vertex (VS\_EXPORT\_COUNT = 8). The first position received is going to have the PC address 00000000000, the second one 00010000000, the third one 00100000000 and so on up to 11110000000. Then the next position received (the 17<sup>th</sup>) is going to have the address 00000001000, the 18<sup>th</sup> 00010001000, the 19<sup>th</sup> 00100001000 and so on. The Current\_Location is never reset except on chip resets. The only thing to be careful about is that if the SX doesn't send a full group of positions (<64) then the PA needs to fill the address space so that the next group starts correctly aligned (for example, if only 33 positions are received then 2\*VS\_EXPORT\_COUNT must be added to Current\_Location and the memory count reset to 0 before the next vector begins).
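The address generation described above can be modelled with a short sketch. The class and method names are hypothetical, and wrapping Current\_Location modulo 128 is an assumption based on the pointer being 7 bits wide.

```python
class PCAddressGenerator:
    """Illustrative model of the PA's parameter cache address generation."""
    NUM_MEMORIES = 16     # 4-bit memory number
    ADDR_BITS = 7         # 7-bit address within a memory
    GROUP_SIZE = 64       # positions in a full group

    def __init__(self, vs_export_count):
        self.vs_export_count = vs_export_count
        self.current_location = 0
        self.memory_number = 0

    def _advance_location(self, steps):
        # Assumption: the 7-bit pointer simply wraps around.
        self.current_location = (
            self.current_location + steps * self.vs_export_count
        ) % (1 << self.ADDR_BITS)

    def next_address(self):
        """Return the 11-bit PC address for the next position received."""
        address = (self.memory_number << self.ADDR_BITS) | self.current_location
        self.memory_number += 1
        if self.memory_number == self.NUM_MEMORIES:
            self.memory_number = 0
            self._advance_location(1)
        return address

    def end_of_group(self, positions_received):
        """Pad out a partial group so the next group starts aligned."""
        wraps_per_group = self.GROUP_SIZE // self.NUM_MEMORIES
        wraps_done = positions_received // self.NUM_MEMORIES
        self._advance_location(wraps_per_group - wraps_done)
        self.memory_number = 0

gen = PCAddressGenerator(vs_export_count=8)
addresses = [format(gen.next_address(), "011b") for _ in range(19)]
```

With VS\_EXPORT\_COUNT = 8, `addresses` reproduces the sequence in the text: the 1st, 2nd and 17th entries are `00000000000`, `00010000000` and `00000001000`. Calling `end_of_group(33)` after 33 positions adds 2\*VS\_EXPORT\_COUNT to the pointer, matching the example above.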




# 17.1 Export restrictions

#### 17.1.1 Pixel exports:

Pixels can export 1, 2, 3 or 4 color buffers to the SX (+z). The exports will be done in order. The PRED\_OPTIMIZE function has to be turned off if the exports are done using interleaved predicated instructions. The exports will always be ordered to the SX.

#### 17.1.2 Vertex exports:

Position or parameter caches can be exported in any order in the shader program. It is always better to export position as soon as possible. Position has to be exported in a single export block (no texture instructions can be placed between the exports). Parameter cache exports can be done in any order with texture instructions interleaved. The PRED\_OPTIMIZE function has to be turned off if the exports are done using interleaved predicated instructions to the Parameter cache (see Arbitration restrictions for details). The exports will always be allocated in order to the SX.

#### 17.1.3 Pass thru exports:

Pass thru exports have to be done in groups of the form:

Alloc 4 (8 or 12)
Execute ALU(ADDR) ALU(DATA) ALU(DATA) ALU(DATA)...

They cannot have texture instructions interleaved in the export block. These exports are not guaranteed to be ordered.

Also, when doing a pass thru export, Position MUST be exported AFTER all pass thru exports. This position export is used to synchronize the chip when doing a transition from pass thru shader to regular shader and vice versa.

#### 17.2 Arbitration restrictions

Here are the Sequencer arbitration restrictions:

- 1) Cannot execute a serialized thread if the corresponding texture pending bit is set
- 2) Cannot allocate position if any older thread has not allocated position
- 3) If last thread is marked as not valid AND marked as last and we are about to execute the second to oldest thread also marked last then:
  - a. Both threads must be from the same context (cannot allow a first thread)
  - b. Must turn off the predicate optimization for the second thread
- 4) Cannot execute a texture clause if texture reads are pending
- 5) Cannot execute last if texture pending (even if not serial)

# 18. Export Types

The export type (or the location where the data should be put) is specified using the destination address field in the ALU instruction. Here is a list of all possible export modes:

# 18.1 Vertex Shading

0:15 - 16 parameter cache

16:31 - Empty (Reserved?)

32 - Export Address

33:40 - 8 vertex exports to the frame buffer and index

41:47 - Empty

48:55 - 8 debug export (interpret as normal vertex export)

60 - export addressing mode

61 - Empty

62 - position

63 - sprite size export that goes with position export (point\_h,point\_w,edgeflag,misc)

# 18.2 Pixel Shading

0 - Color for buffer 0 (primary)

1 - Color for buffer 1

2 - Color for buffer 2

3 - Color for buffer 3

4:7 - Empty

8 - Buffer 0 Color/Fog (primary)

9 - Buffer 1 Color/Fog

10 - Buffer 2 Color/Fog

11 - Buffer 3 Color/Fog

12:15 - Empty

16:31 - Empty (Reserved?)

32 - Export Address

33:40 - 8 exports for multipass pixel shaders.

41:47 - Empty

48:55 - 8 debug exports (interpret as normal pixel export)

60 - export addressing mode

61:62 - Empty

63 - Z for primary buffer (Z exported to 'alpha' component)

# 19. Special Interpolation modes

### 19.1 Real time commands

We are unable to use the parameter memory since there is no way for a command stream to write into it. Instead we need to add three 16x128 memories (one for each of three vertices x 16 interpolants). These will be mapped onto the register bus and written by type 0 packets, and output to the parameter busses (the sequencer and/or PA need to be able to address the realtime parameter memory as well as the regular parameter store). For higher performance we should be able to view them as two banks of 16 and do double buffering, allowing one to be loaded while the other is rasterized with. Most overlay shaders will need 2 or 4 scalar coordinates. One option might be to restrict the memory to 16x64 or 32x64, allowing only two interpolated scalars per cycle. The only problem I see with this is that, if we view support for 16 vector-4 interpolants as important (true only if we map Microsoft's high priority stream to the realtime stream), then the PA/sequencer need to support a realtime-specific mode where we need to address 32 vectors of parameters instead of 16. This mode is triggered by the primitive type: REAL TIME. The actual memories are in the SX blocks. The parameter data memories are hooked on the RBBM bus and are loaded by the CP using register mapped memory.

# 19.2 Sprites/ XY screen coordinates/ FB information

When working with sprites, one may want to overwrite the parameter 0 with SC generated data. Also, XY screen coordinates may be needed in the shader program. This functionality is controlled by the gen\_I0 register (in SQ) in conjunction with the SND\_XY register (in SC). Also it is possible to send the faceness information (for OGL front/back special operations) to the shader using the same control register. Here is a list of all the modes and how they interact together:

Gen\_st is a bit taken from the interface between the SC and the SQ. This is the MSB of the primitive type. If the bit is set, it means we are dealing with Point AA, Line AA or sprite and in this case the vertex values are going to be generated between 0 and 1.

| Param\_Gen\_I0 | snd\_xy | gen\_st | I0                                    |
|----------------|---------|---------|---------------------------------------|
| disable        | disable | no      | No modification                       |
| disable        | disable | yes     | No modification                       |
| disable        | enable  | no      | No modification                       |
| disable        | enable  | yes     | No modification                       |
| enable         | disable | no      | garbage, garbage, garbage, faceness   |
| enable         | disable | yes     | garbage, garbage, s, t                |
| enable         | enable  | no      | screen x, screen y, garbage, faceness |
| enable         | enable  | yes     | screen x, screen y, s, t              |
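The eight mode combinations listed above reduce to two independent choices once Param\_Gen\_I0 is enabled. The following sketch captures that reading; the function name and the tuple-of-strings return value are purely illustrative.

```python
def i0_contents(param_gen_i0, snd_xy, gen_st):
    """Return the four components written to I0, or None when the
    interpolated parameter 0 is left unmodified."""
    if not param_gen_i0:
        return None                      # all four 'disable' rows
    first_two = ("screen x", "screen y") if snd_xy else ("garbage", "garbage")
    last_two = ("s", "t") if gen_st else ("garbage", "faceness")
    return first_two + last_two

print(i0_contents(True, True, False))
```

In other words, snd\_xy selects what goes into the first two components and gen\_st selects what goes into the last two.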

### 19.3 Auto generated counters

When dealing with multipass shaders, the sequencer generates a vector count that is used both to write the 1<sup>st</sup> pass data to memory and to retrieve the data on the 2<sup>nd</sup> pass. The count is always generated in the same way but it is passed to the shader in a slightly different way depending on the shader type (pixel or vertex). This is toggled on and off using the GEN\_INDEX register. The sequencer is going to keep two counters, one for pixels and one for vertices. Every time a full vector of vertices or pixels is written to the GPRs the counter is incremented. Every time a state change is detected, the corresponding counter is reset. While there is only one count broadcast to the GPRs, the LSBs are hardwired to specific values, making the index different for all elements in the vector.

# 19.3.1 Vertex shaders

In the case of vertex shaders, if GEN\_INDEX is set, the data will be put into the x field of the third register (it means that the compiler must allocate 3 GPRs in all multipass vertex shader modes).

### 19.3.2 Pixel shaders

In the case of pixel shaders, if GEN\_INDEX is set and Param\_Gen\_I0 is enabled, the data will be put in the x field of the  $2^{nd}$  register (R1.x), else if GEN\_INDEX is set the data will be put into the x field of the  $1^{st}$  register (R0.x).
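The placement rules for the generated index in sections 19.3.1 and 19.3.2 can be summarized in a small sketch. The function name and the string labels for shader types are assumptions for illustration only.

```python
def gen_index_destination(shader_type, gen_index, param_gen_i0=False):
    """Return the GPR field that receives the auto generated count,
    or None when GEN_INDEX is not set."""
    if not gen_index:
        return None
    if shader_type == "vertex":
        return "R2.x"                    # x field of the third register
    if shader_type == "pixel":
        # Parameter 0 occupies R0 when Param_Gen_I0 is enabled.
        return "R1.x" if param_gen_i0 else "R0.x"
    raise ValueError(f"unknown shader type: {shader_type}")
```

For example, `gen_index_destination("pixel", True, param_gen_i0=True)` gives `"R1.x"`, since R0 already holds the generated parameter 0 in that mode.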



Figure 12: GPR input mux Control

### 20. State management

Every clock, the sequencer will report to the CP the oldest states still in the pipe. These are the states of the programs as they enter the last ALU clause.

### 20.1 Parameter cache synchronization

In order for the sequencer not to begin a group of pixels before the associated group of vertices has finished, the sequencer will keep a 6 bit count per state (for a total of 8 counters). These counters are initialized to 0 and every time a vertex shader exports its data TO THE PARAMETER CACHE, the corresponding counter is incremented. When the SC sends a new vector of pixels with the SC\_SQ\_new\_vector bit asserted, the sequencer will first check if the count is greater than 0 before accepting the transmission (it will in fact accept the transmission but then lower its ready to receive). Then the sequencer waits for the count to go to one and decrements it. The sequencer can then issue the group of pixels to the interpolators. Every time the state changes, the new state counter is initialized to 0.

### 21. XY Address imports

The SC will be able to send the XY addresses to the GPRs. It does so by interleaving the writes of the IJs (to the IJ buffer) with XY writes (to the XY buffer). Then when writing the data to the GPRs, the sequencer is going to interpolate the IJ data or pass the XY data through a fix-to-float converter and expander and write the converted values to the GPRs. The XYs are currently SCREEN SPACE COORDINATES. The values in the XY buffers will wrap. See section 19.2 for details on how to control the interpolation in this mode.

### 21.1 Vertex indexes imports

In order to import vertex indexes, we have 16 8x96 staging registers. These are loaded one line at a time by the VGT block (96 bits). They are loaded in floating point format and can be transferred in 4 or 8 clocks to the GPRs.

## 22. Registers

## 22.1 Control

| Register            | Description |
|---------------------|-------------|
| REG\_DYNAMIC        | Dynamic allocation (pixel/vertex) of the register file on or off. |
| REG\_SIZE\_PIX      | Size of the register file's pixel portion (minimal size when dynamic allocation turned on) |
| REG\_SIZE\_VTX      | Size of the register file's vertex portion (minimal size when dynamic allocation turned on) |
| ARBITRATION\_POLICY | policy of the arbitration between vertexes and pixels |
| INST\_BASE\_VTX     | start point for the vertex instruction store (RT always ends at vertex\_base and begins at 0) |
| INST\_BASE\_PIX     | start point for the pixel shader instruction store |
| ONE\_THREAD         | debug state register. Only allows one program at a time into the GPRs |
| ONE\_ALU            | debug state register. Only allows one ALU program at a time to be executed (instead |
| INSTRUCTION         | This is where the CP puts the base address of the instruction writes and type (auto-incremented on reads/writes). Register mapped |
| CONSTANTS           | 512\*4 ALU constants + 32\*6 Texture state 32 bits registers (logically mapped) |
| CONSTANTS\_RT       | 256\*4 ALU constants + 32\*6 texture states? (physically mapped) |
| CONSTANT\_EO\_RT    | This is the size of the space reserved for real time in the constant store (from 0 to CONSTANT\_EO\_RT). The re-mapping table operates on the rest of the memory |
| TSTATE\_EO\_RT      | This is the size of the space reserved for real time in the fetch state store (from 0 to TSTATE\_EO\_RT). The re-mapping table operates on the rest of the memory |

## 22.2 Context

| Register                 | Description |
|--------------------------|-------------|
| PS\_BASE                 | base pointer for the pixel shader in the instruction store |
| VS\_BASE                 | base pointer for the vertex shader in the instruction store |
| VS\_CF\_SIZE             | size of the vertex shader (# of instructions in control program/2) |
| PS\_CF\_SIZE             | size of the pixel shader (# of instructions in control program/2) |
| PS\_SIZE                 | size of the pixel shader (cntl+instructions) |
| VS\_SIZE                 | size of the vertex shader (cntl+instructions) |
| PS\_NUM\_REG             | number of GPRs to allocate for pixel shader programs |
| VS\_NUM\_REG             | number of GPRs to allocate for vertex shader programs |
| PARAM\_SHADE             | One 16 bit register specifying which parameters are to be gouraud shaded (0 = flat, 1 = gouraud) |
| PARAM\_WRAP              | 64 bits: for which parameters (and channels (xyzw)) do we do the cyl wrapping (0=linear, 1=cylindrical). |
| PS\_EXPORT\_MODE         | 0xxxx: Normal mode. 1xxxx: Multipass mode. If normal, bbbz where bbb is how many colors (0-4) and z is export z or not. If multipass 1-12 exports for color. |
| VS\_EXPORT\_MODE         | 0: position (1 vector), 1: position (2 vectors), 3: multipass |
| VS\_EXPORT\_COUNT        | Number of locations exported by the VS (and thus number of interpolated parameters) |
| PARAM\_GEN\_I0           | Do we overwrite or not the parameter 0 with XY data and generated T and S values |
| GEN\_INDEX               | Auto generates an address from 0 to XX. Puts the results into R0-1 for pixel shaders and R2 for vertex shaders |
| CONST\_BASE\_VTX         | (9 bits) Logical Base address for the constants of the Vertex shader |
| CONST\_BASE\_PIX         | (9 bits) Logical Base address for the constants of the Pixel shader |
| CONST\_SIZE\_PIX         | (8 bits) Size of the logical constant store for pixel shaders |
| CONST\_SIZE\_VTX         | (8 bits) Size of the logical constant store for vertex shaders |
| INST\_PRED\_OPTIMIZE     | Turns on the predicate bit optimization (if off, conditional\_execute\_predicates is always executed). |
| CF\_BOOLEANS             | 256 boolean bits |
| CF\_LOOP\_COUNT          | 32x8 bit counters (number of times we traverse the loop) |
| CF\_LOOP\_START          | 32x8 bit counters (init value used in index computation) |
| CF\_LOOP\_STEP           | 32x8 bit counters (step value used in index computation) |

### 23. DEBUG Registers

#### 23.1 Context

| Register          | Description |
|-------------------|-------------|
| DB\_PROB\_ADDR    | instruction address where the first problem occurred |
| DB\_PROB\_COUNT   | number of problems encountered during the execution of the program |
| DB\_PROB\_BREAK   | break the clause if an error is found. |
| DB\_ON            | turns on and off debug method 2 |
| DB\_INST\_COUNT   | instruction counter for debug method 2 |
| DB\_BREAK\_ADDR   | break address for method number 2 |

#### 23.2 Control

| Register              | Description |
|-----------------------|-------------|
| DB\_ALUCST\_MEMSIZE   | Size of the physical ALU constant memory |
| DB\_TSTATE\_MEMSIZE   | Size of the physical texture state memory |

### 24. Interfaces

#### 24.1 External Interfaces

Whenever an x is used, it means that the bus is broadcast to all units of the same name. For example, if a bus is named SQ→SPx it means that SQ is going to broadcast the same information to all SP instances.

#### 24.2 SC to SP Interfaces

## 24.2.1 SC\_SP#

There is one of these interfaces at front of each of the SP (buffer to stage pixel interpolators). This interface transmits the I,J data for pixel interpolation. For the entire system, two quads per clock are transferred to the 4 SPs, so each of these 4 interfaces transmits one half of a quad per clock. The interface below describes a half of a quad worth of data.

The actual data which is transferred per quad is:

Ref Pix I => S4.20 Floating Point I value

Ref Pix J => S4.20 Floating Point J value

Delta Pix I (x3) => S4.8 Floating Point Delta I value

Delta Pix J (x3) => S4.8 Floating Point Delta J value

This equates to a total of 128 bits, which are transferred over 2 clocks and therefore need an interface 64 bits wide.

Additionally, X,Y data (12-bit unsigned fixed) is conditionally sent across this data bus over the same wires in an additional clock. The X,Y data is sent on the lower 24 bits of the data bus with faceness in the msb. Transfers across these interfaces are synchronized with the SC\_SQ IJ Control Bus transfers.

The data transfer across each of these busses is controlled by an IJ\_BUF\_INUSE\_COUNT in the SC. Each time the SC has sent a pixel vector's worth of data to the SPs, it will increment the IJ\_BUF\_INUSE\_COUNT count. Prior to sending the next pixel vector's data, it will check to make sure the count is less than MAX\_BUFER\_MINUS\_2; if not, the SC will stall until the SQ returns a pipelined pulse to decrement the count when it has scheduled a buffer free. Note: We could/may optimize for the case of sending only IJ to use all the buffers to pre-load more. Currently it is planned for the SP to hold 2 double buffers of I,J data and two buffers of X,Y data, so if either X,Y or Centers and Centroids are on, then the SC can send two Buffers.
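The count-based flow control described above behaves like a simple credit counter. The sketch below is an illustrative software model only: the class and method names are assumptions, and the limit passed in stands for the MAX\_BUFER\_MINUS\_2 value, whose actual setting is not given here.

```python
class IJBufferFlowControl:
    """Illustrative model of the SC-side IJ_BUF_INUSE_COUNT handshake."""

    def __init__(self, max_buffer_minus_2):
        self.limit = max_buffer_minus_2
        self.inuse_count = 0

    def can_send(self):
        """The SC may send a pixel vector only while the count is below the limit."""
        return self.inuse_count < self.limit

    def vector_sent(self):
        """Called once a full pixel vector's worth of data has been sent."""
        if not self.can_send():
            raise RuntimeError("SC must stall: no free IJ buffer")
        self.inuse_count += 1

    def buffer_freed(self):
        """Called when the SQ returns its pulse for a freed buffer."""
        if self.inuse_count == 0:
            raise RuntimeError("free pulse received with no buffer in use")
        self.inuse_count -= 1
```

In this model the SC polls `can_send()` before each vector and stalls while it returns False; the SQ's pulse maps to `buffer_freed()`.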

In at least the initial version, the SC shall send 16 quads per pixel vector even if the vector is not full. This will increment buffer write address pointers correctly all the time. (We may revisit this for the SX, SP and SQ and add an EndOfVector signal on all interfaces to quit early. We opted for the simple mode first with a belief that only the end of packet and multiple new vector signals should cause a partial vector and that this would not really be a significant performance hit.)

| Name                  | Bits | Description |
|-----------------------|------|-------------|
| SC_SP#_data           | 64   | IJ information sent over 2 clocks (or X,Y in 24 LSBs with faceness in upper bit). Type 0 or 1: first clock I, second clock J. Field layouts for each type are listed below. |
| SC_SP#_valid          | 1    | Valid |
| SC_SP#_last_quad_data | 1    | This bit will be set on the last transfer of data per quad. |
| SC_SP#_type           | 2    | 0 -> Indicates centroids. 1 -> Indicates centers. 2 -> Indicates X,Y Data and faceness on data bus. The SC shall look at state data to determine how many types to send for the interpolation process. |

SC_SP#_data layout, Type 0 or 1:

| Field  | ULC     | URC     | LLC     | LRC    |
|--------|---------|---------|---------|--------|
| Bits   | [63:39] | [38:26] | [25:13] | [12:0] |
| Format | SE4M20  | SE4M8   | SE4M8   | SE4M8  |

SC_SP#_data layout, Type 2:

| Field  | Face | X        | Y        |
|--------|------|----------|----------|
| Bits   | [63] | [23:12]  | [11:0]   |
| Format | Bit  | Unsigned | Unsigned |
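The bit ranges given for SC_SP#_data can be checked with a short field-extraction sketch. It only slices the raw bit fields out of a 64-bit word; it does not decode the SE4M20/SE4M8 floating point formats, and the helper names are illustrative.

```python
def field(word, hi, lo):
    """Extract bits [hi:lo] (inclusive) from an integer bus word."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)

def unpack_ij(word):
    """Type 0 or 1 transfer: one reference value and three deltas."""
    return {
        "ULC": field(word, 63, 39),   # 25 bits, SE4M20
        "URC": field(word, 38, 26),   # 13 bits, SE4M8
        "LLC": field(word, 25, 13),   # 13 bits, SE4M8
        "LRC": field(word, 12, 0),    # 13 bits, SE4M8
    }

def unpack_xy(word):
    """Type 2 transfer: faceness in the MSB, X and Y in the low 24 bits."""
    return {
        "face": field(word, 63, 63),
        "x": field(word, 23, 12),
        "y": field(word, 11, 0),
    }

xy_word = (1 << 63) | (0xABC << 12) | 0x123
print(unpack_xy(xy_word))
```

The four type 0/1 fields are 25 + 13 + 13 + 13 = 64 bits wide, so two clocks carry the 128 bits of I and J data per quad.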

The # is included for clarity in the spec and will be replaced with a prefix of u#\_ in the verilog module statement for the SC and the SP block will have neither because the instantiation will insert the prefix.

### 24.2.2 SC SQ

This is the control information sent to the sequencer in order to synchronize and control the interpolation and/or loading data into the GPRs needed to execute a shader program on the sent pixels. This data will be sent over two clocks per transfer with 1 to 16 transfers. Therefore the bus (approx 92 bits) could be folded in half to approx 47 bits.

| Name        | Bits | Description |
|-------------|------|-------------|
| SC_SQ_data  | 46   | Control Data sent to the SQ. **1 clk transfers.** Event: valid data consist of event_id and state_id. Instruct SQ to post an event vector to send state id and event_id through request fifo and onto the reservation stations, making sure state id and/or event_id gets back to the CP. Events only follow end of packets so no pixel vectors will be in progress. Empty Quad Mask: Transfer Control data consisting of pc_dealloc or new vector. Receipt of this is to transfer pc_dealloc or new_vector without any valid quad data. New vector will always be posted to request fifo and pc_dealloc will be attached to any pixel vector outstanding or posted in request fifo if no valid quad outstanding. **2 clk transfers.** Quad Data Valid: Sending quad data with or without new_vector or pc_dealloc. New vector will be posted to request fifo with or without a pixel vector and pc_dealloc will be posted with a pixel vector unless none is in progress. In this case the pc_dealloc will be posted in the request queue. Filler quads will be transferred with the Quad mask set but the corresponding pixel mask set to zero. |
| SC_SQ_valid | 1    | SC sending valid data, 2<sup>nd</sup> clk could be all zeroes |

SC\_SQ\_data – first clock and second clock transfers are shown in the table below.

| Name               | BitField | Bits | Description |
|--------------------|----------|------|-------------|
| 1st Clock Transfer |          |      |             |
| SC_SQ_event        | 0        | 1    | This transfer is a 1 clock event vector. Force quad_mask = new_vector = pc_dealloc = 0 |
| SC_SQ_event_id     | [2:1]    | 2    | This field identifies the event.<br>0 => denotes an End Of State Event<br>1 => TBD |
| SC_SQ_pc_dealloc   | [5:3]    | 3    | Deallocation token for the Parameter Cache |
| SC_SQ_new_vector   | 6        | 1    | The SQ must wait for Vertex shader done count > 0 and after dispatching the Pixel Vector the SQ will decrement the count. |
| SC_SQ_quad_mask    | [10:7]   | 4    | Quad Write mask left to right SP0 => SP3 |
| SC_SQ_end_of_prim  | 11       | 1    | End Of the primitive |
| SC_SQ_state_id     | [14:12]  | 3    | State/constant pointer (6*3+3) |
| SC_SQ_pix_mask     | [30:15]  | 16   | Valid bits for all pixels SP0=>SP3 (UL,UR,LL,LR) |
| SC_SQ_prim_type    | [33:31]  | 3    | Stippled line and Real time command need to load texture coordinates from alternate buffer.<br>000: Normal<br>010: Realtime<br>101: Line AA<br>110: Point AA (Sprite) |
| SC_SQ_provok_vtx   | [35:34]  | 2    | Provoking vertex for flat shading |
| SC_SQ_pc_ptr0      | [46:36]  | 11   | Parameter Cache pointer for vertex 0 |
| 2nd Clock Transfer |          |      |             |
| SC_SQ_pc_ptr1      | [10:0]   | 11   | Parameter Cache pointer for vertex 1 |
| SC_SQ_pc_ptr2      | [21:11]  | 11   | Parameter Cache pointer for vertex 2 |
| SC_SQ_lod_correct  | [45:22]  | 24   | LOD correction per quad (6 bits per quad) |
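
As an illustration of the bit layout in the table above, the following Python sketch packs and unpacks the first clock transfer. The field positions are taken from the table; the names `FIRST_CLOCK_FIELDS`, `pack_first_clock` and `unpack_first_clock` are hypothetical and only model the 47-bit word in software. They do not describe the hardware implementation.

```python
# Software model of the first-clock SC_SQ_data word.
# Each entry maps a field name to (least significant bit, width in bits),
# copied from the BitField and Bits columns of the table.
FIRST_CLOCK_FIELDS = {
    "event": (0, 1),
    "event_id": (1, 2),
    "pc_dealloc": (3, 3),
    "new_vector": (6, 1),
    "quad_mask": (7, 4),
    "end_of_prim": (11, 1),
    "state_id": (12, 3),
    "pix_mask": (15, 16),
    "prim_type": (31, 3),
    "provok_vtx": (34, 2),
    "pc_ptr0": (36, 11),
}


def pack_first_clock(**fields):
    """Pack named field values into one integer; omitted fields are zero."""
    word = 0
    for name, value in fields.items():
        lsb, width = FIRST_CLOCK_FIELDS[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word |= value << lsb
    return word


def unpack_first_clock(word):
    """Split a packed word back into a dict of field values."""
    return {
        name: (word >> lsb) & ((1 << width) - 1)
        for name, (lsb, width) in FIRST_CLOCK_FIELDS.items()
    }


# Example: a new_vector transfer with quads valid for SP1 and SP3.
word = pack_first_clock(new_vector=1, quad_mask=0b1010, state_id=5, pc_ptr0=0x155)
fields = unpack_first_clock(word)
```

A table-driven layout keeps the model in step with the specification: a change to a bit range only touches one dictionary entry.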

| Name               | Bits | Description |
|--------------------|------|-------------|
| SQ_SC_free_buff    | 1    | Pipelined bit that instructs SC to decrement count of buffers in use. |
| SQ_SC_dec_cntr_cnt | 1    | Pipelined bit that instructs SC to decrement count of new vector and/or event sent to prevent SC from overflowing SQ interpolator/Reservation request fifo. |

The scan converter will submit a partial vector whenever:

- 1.) It receives a primitive marked with an end of packet signal.
- 2.) A current pixel vector is being assembled with at least one valid quad and the vector has been marked for deallocate when a primitive marked new\_vector arrives. The Scan Converter will submit a partial vector (up to 16 quads with zero pixel mask to fill out the vector) prior to submitting the new\_vector marker/primitive.

(This prevents a hang that can be demonstrated when all primitives in a packet of three vectors are culled except for a one-quad primitive that gets marked pc\_dealloc (vertices maximum size). In this case two new\_vectors are submitted and processed, but then the one valid quad with the pc\_dealloc creates a vector, and the new vector would wait for another vertex vector to be processed. The one being waited for could never export until the pc\_dealloc signal made it through, hence the hang.)

#### 24.2.3 SQ to SX: Interpolator bus

| Name                       | Direction | Bits | Description                                   |
|----------------------------|-----------|------|-----------------------------------------------|
| SQ_SXx_interp_flat_vtx     | SQ→SPx    | 2    | Provoking vertex for flat shading             |
| SQ_SXx_interp_flat_gouraud | SQ→SPx    | 1    | Flat or gouraud shading                       |
| SQ_SXx_interp_cyl_wrap     | SQ→SPx    | 4    | Which channel needs to be cylindrical wrapped |
| SQ_SXx_pc_ptr0             | SQ→SXx    | 11   | Parameter Cache Pointer                       |
| SQ_SXx_pc_ptr1             | SQ→SXx    | 11   | Parameter Cache Pointer                       |
| SQ_SXx_pc_ptr2             | SQ→SXx    | 11   | Parameter Cache Pointer                       |
| SQ_SXx_rt_sel              | SQ→SXx    | 1    | Selects between RT and Normal data            |
| SQ_SXx_pc_wr_en            | SQ→SXx    | 1    | Write enable for the PC memories              |
| SQ_SXx_pc_wr_addr          | SQ→SXx    | 7    | Write address for the PCs                     |
| SQ_SXx_pc_channel_mask     | SQ→SXx    | 4    | Channel mask                                  |

### 24.2.4 SQ to SP: Staging Register Data

This is a broadcast bus that sends the VSISR information to the staging registers of the shader pipes.

| Name               | Direction | Bits | Description                                            |
|--------------------|-----------|------|--------------------------------------------------------|
| SQ_SPx_vsr_data    | SQ→SPx    | 96   | Pointers of indexes or HOS surface information         |
| SQ_SPx_vsr_double  | SQ→SPx    | 1    | 0: Normal 96 bits per vert 1: double 192 bits per vert |
| SQ_SP0_vsr_valid   | SQ→SP0    | 1    | Data is valid                                          |
| SQ_SP1_vsr_valid   | SQ→SP1    | 1    | Data is valid                                          |
| SQ_SP2_vsr_valid   | SQ→SP2    | 1    | Data is valid                                          |
| SQ_SP3_vsr_valid   | SQ→SP3    | 1    | Data is valid                                          |
| SQ_SPx_vsr_read    | SQ→SPx    | 1    | Increment the read pointers                            |

#### 24.2.5 VGT to SQ: Vertex interface

#### 24.2.5.1 Interface Signal Table

The area difference between the two methods is not sufficient to warrant complicating the interface or the state requirements of the VSISRs. Therefore, the POR for this interface is that the VGT will transmit the data to the VSISRs (via the Shader Sequencer) in full, 32-bit floating-point format. The VGT can transmit up to six 32-bit floating-point values to each VSISR where four or more values require two transmission clocks. The data bus is 96 bits wide.
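
The clock count implied by the paragraph above can be worked out from the bus width. The following Python sketch is only an arithmetic illustration; the function name `vsisr_transfer_clocks` and the constants are hypothetical names, not part of the interface definition.

```python
BUS_WIDTH_BITS = 96        # width of the VGT to SQ data bus
VALUE_WIDTH_BITS = 32      # one 32-bit floating-point value
MAX_VALUES_PER_VSISR = 6   # upper limit stated in the text


def vsisr_transfer_clocks(num_values):
    """Return the number of transmission clocks needed for one VSISR."""
    if not 1 <= num_values <= MAX_VALUES_PER_VSISR:
        raise ValueError("a VSISR receives between 1 and 6 values")
    values_per_clock = BUS_WIDTH_BITS // VALUE_WIDTH_BITS  # 3 values fit per clock
    # Ceiling division without floating point.
    return -(-num_values // values_per_clock)
```

With three values per clock, one to three values fit in a single clock and four to six values need two, which matches the statement that four or more values require two transmission clocks.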

| Name                 | Bits | Description |
|----------------------|------|-------------|
| VGT_SQ_vsisr_data    | 96   | Pointers of indexes or HOS surface information |
| VGT_SQ_vsisr_double  | 1    | 0: Normal 96 bits per vert 1: double 192 bits per vert |
| VGT_SQ_end_of_vector | 1    | Indicates the last VSISR data set for the current process vector (for double vector data, "end_of_vector" is set on the first vector) |
| VGT_SQ_indx_valid    | 1    | Vsisr data is valid |
| VGT_SQ_state         | 3    | Render State (6*3+3 for constants). This signal is guaranteed to be correct when "VGT_SQ_vgt_end_of_vector" is high. |
| VGT_SQ_send          | 1    | Data on the VGT_SQ is valid (see write-up for standard R400 SEND/RTR interface handshaking) |
| SQ_VGT_rtr           | 1    | Ready to receive (see write-up for standard R400 SEND/RTR interface handshaking) |

#### 24.2.5.2 Interface Diagrams



Figure 1. Detailed Logical Diagram for PA SQ vgt Interface.


#### 24.2.6 SQ to SX: Control bus

| Name               | Direction | Bits | Description                                                                                                                                    |
|--------------------|-----------|------|------------------------------------------------------------------------------------------------------------------------------------------------|
| SQ_SXx_exp_type    | SQ→SXx    | 2    | 00: Pixel without z (1 to 4 buffers) 01: Pixel with z (1 to 4 buffers) 10: Position (1 or 2 results) 11: Pass thru (4,8 or 12 results aligned) |
| SQ_SXx_exp_number  | SQ→SXx    | 2    | Number of locations needed in the export buffer (encoding depends on the type, see below).                                                     |
| SQ_SXx_exp_alu_id  | SQ→SXx    | 1    | ALU ID                                                                                                                                         |
| SQ_SXx_exp_valid   | SQ→SXx    | 1    | Valid bit                                                                                                                                      |
| SQ_SXx_exp_state   | SQ→SXx    | 3    | State Context                                                                                                                                  |
| SQ_SXx_free_done   | SQ→SXx    | 1    | Pulse to indicate that the previous export is finished (this can be sent with or without the other fields of the interface)                    |
| SQ_SXx_free_alu_id | SQ→SXx    | 1    | ALU ID                                                                                                                                         |

The number of export locations depends on the type:

- Type 00: Pixels without Z
  - 00 = 1 buffer
  - 01 = 2 buffers
  - 10 = 3 buffers
  - 11 = 4 buffers
- Type 01: Pixels with Z
  - 00 = 2 buffers (color + Z)
  - 01 = 3 buffers (2 color + Z)
  - 10 = 4 buffers (3 color + Z)
  - 11 = 5 buffers (4 color + Z)
- Type 10: Position export
  - 00 = 1 position
  - 01 = 2 positions
  - 1X = Undefined
- Type 11: Pass Thru
  - 00 = 4 buffers
  - 01 = 8 buffers
  - 10 = 12 buffers
  - 11 = Undefined
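
The encoding listed above amounts to a lookup from the (type, number) pair to a location count. The following Python sketch is a minimal software model of that lookup; `EXPORT_LOCATIONS` and `export_locations` are hypothetical names, and returning `None` for undefined encodings is an assumption made for the example only.

```python
# (exp_type, exp_number) -> number of export locations, as listed above.
# Encodings marked Undefined in the list are simply absent from the table.
EXPORT_LOCATIONS = {
    0b00: {0b00: 1, 0b01: 2, 0b10: 3, 0b11: 4},   # pixels without Z
    0b01: {0b00: 2, 0b01: 3, 0b10: 4, 0b11: 5},   # pixels with Z (colors + Z)
    0b10: {0b00: 1, 0b01: 2},                     # position export
    0b11: {0b00: 4, 0b01: 8, 0b10: 12},           # pass thru
}


def export_locations(exp_type, exp_number):
    """Return the location count, or None for an undefined encoding."""
    if not (0 <= exp_type <= 3 and 0 <= exp_number <= 3):
        raise ValueError("exp_type and exp_number are 2-bit fields")
    return EXPORT_LOCATIONS[exp_type].get(exp_number)
```

For example, `export_locations(0b01, 0b10)` gives 4 (three color buffers plus Z), while `export_locations(0b10, 0b11)` gives `None`.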

Below the thick black line is the end-of-transfer packet that tells the SX that a given export is finished. The report packet will always arrive either before or at the same time as the next export to the same ALU id.

### 24.2.7 SX to SQ: Output file control

| Name                 | Direction | Bits | Description                                                                                                                                                                                                         |
|----------------------|-----------|------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| SXx_SQ_exp_count_rdy | SXx→SQ    | 1    | Raised by SX0 to indicate that the following two fields reflect the result of the most recent export                                                                                                                |
| SXx_SQ_exp_pos_avail | SXx→SQ    | 1    | Specifies whether there is room for another position.                                                                                                                                                               |
| SXx_SQ_exp_buf_avail | SXx→SQ    | 7    | Specifies the space available in the output buffers.<br>0: buffers are full<br>1: 2K-bits available (32 bits for each of the 64 pixels in a clause)<br>64: 128K-bits available (16 128-bit entries for each of 64 pixels)<br>65-127: RESERVED |
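
The two examples given for SXx\_SQ\_exp\_buf\_avail (1 gives 2K-bits, 64 gives 128K-bits) are consistent with a granularity of 2K-bits per count. The Python sketch below assumes that linear scaling also holds for the intermediate values, which the table does not state explicitly; the name `exp_buf_avail_bits` is hypothetical.

```python
PIXELS_PER_CLAUSE = 64
BITS_PER_UNIT = 32 * PIXELS_PER_CLAUSE   # 32 bits per pixel = 2K-bits per count
MAX_DEFINED_VALUE = 64                   # 65-127 are reserved


def exp_buf_avail_bits(value):
    """Convert the 7-bit exp_buf_avail field into available bits.

    Assumes linear scaling between the documented points 0, 1 and 64.
    """
    if not 0 <= value <= 127:
        raise ValueError("exp_buf_avail is a 7-bit field")
    if value > MAX_DEFINED_VALUE:
        raise ValueError("values 65-127 are reserved")
    return value * BITS_PER_UNIT
```

Under this assumption the maximum defined value, 64, gives 131072 bits, which equals 16 entries of 128 bits for each of 64 pixels.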




### 24.2.8 SQ to TP: Control bus

Once every clock, the fetch unit tells the sequencer which RS line it is currently working on and whether the data in the GPRs is ready. This lets the sequencer update the fetch valid bits for the reservation station. The sequencer also provides the instruction and constants for the fetch to execute, and the address in the register file at which to write the fetch return data.

| Name                   | Direction | Bits | Description                                               |
|------------------------|-----------|------|-----------------------------------------------------------|
| TPx_SQ_data_rdy        | TPx→ SQ   | 1    | Data ready                                                |
| TPx_SQ_rs_line_num     | TPx→ SQ   | 6    | Line number in the Reservation station                    |
| TPx_SQ_type            | TPx→ SQ   | 1    | Type of data sent (0:PIXEL, 1:VERTEX)                     |
| SQ_TPx_send            | SQ→TPx    | 1    | Sending valid data                                        |
| SQ_TPx_const           | SQ→TPx    | 48   | Fetch state sent over 4 clocks (192 bits total)           |
| SQ_TPx_instr           | SQ→TPx    | 24   | Fetch instruction sent over 4 clocks                      |
| SQ_TPx_end_of_group    | SQ→TPx    | 1    | Last instruction of the group                             |
| SQ_TPx_Type            | SQ→TPx    | 1    | Type of data sent (0:PIXEL, 1:VERTEX)                     |
| SQ_TPx_gpr_phase       | SQ→TPx    | 2    | Write phase signal                                        |
| SQ_TP0_lod_correct     | SQ→TP0    | 6    | LOD correct 3 bits per comp 2 components per quad         |
| SQ_TP0_pix_mask        | SQ→TP0    | 4    | Pixel mask 1 bit per pixel                                |
| SQ_TP1_lod_correct     | SQ→TP1    | 6    | LOD correct 3 bits per comp 2 components per quad         |
| SQ_TP1_pix_mask        | SQ→TP1    | 4    | Pixel mask 1 bit per pixel                                |
| SQ_TP2_lod_correct     | SQ→TP2    | 6    | LOD correct 3 bits per comp 2 components per quad         |
| SQ_TP2_pix_mask        | SQ→TP2    | 4    | Pixel mask 1 bit per pixel                                |
| SQ_TP3_lod_correct     | SQ→TP3    | 6    | LOD correct 3 bits per comp 2 components per quad         |
| SQ_TP3_pix_mask        | SQ→TP3    | 4    | Pixel mask 1 bit per pixel                                |
| SQ_TPx_rs_line_num     | SQ→TPx    | 6    | Line number in the Reservation station                    |
| SQ_TPx_write_gpr_index | SQ→TPx    | 7    | Index into Register file for write of returned Fetch Data |

### 24.2.9 TP to SQ: Texture stall

The TP sends this signal to the SQ and the SPs when its input buffer is full.



| Name              | Direction | Bits | Description                                   |
|-------------------|-----------|------|-----------------------------------------------|
| TP_SQ_fetch_stall | TP→SQ     | 1    | Do not send more texture requests if asserted |




### 24.2.10 SQ to SP: Texture stall

| Name               | Direction | Bits | Description                                   |
|--------------------|-----------|------|-----------------------------------------------|
| SQ_SPx_fetch_stall | SQ→SPx    | 1    | Do not send more texture requests if asserted |

### 24.2.11 SQ to SP: GPR and auto counter

| Name                 | Direction | Bits | Description                                                                                                                      |
|----------------------|-----------|------|----------------------------------------------------------------------------------------------------------------------------------|
| SQ_SPx_gpr_wr_addr   | SQ→SPx    | 7    | Write address                                                                                                                    |
| SQ_SPx_gpr_rd_addr   | SQ→SPx    | 7    | Read address                                                                                                                     |
| SQ_SPx_gpr_rd_en     | SQ→SPx    | 1    | Read Enable                                                                                                                      |
| SQ_SP0_gpr_wr_en     | SQ→SPx    | 1    | Write Enable for the GPRs of SP0                                                                                                 |
| SQ_SP1_gpr_wr_en     | SQ→SPx    | 1    | Write Enable for the GPRs of SP1                                                                                                 |
| SQ_SP2_gpr_wr_en     | SQ→SPx    | 1    | Write Enable for the GPRs of SP2                                                                                                 |
| SQ_SP3_gpr_wr_en     | SQ→SPx    | 1    | Write Enable for the GPRs of SP3                                                                                                 |
| SQ_SPx_gpr_phase     | SQ→SPx    | 2    | The phase mux (arbitrates between inputs, ALU SRC reads and writes)                                                              |
| SQ_SPx_channel_mask  | SQ→SPx    | 4    | The channel mask                                                                                                                 |
| SQ_SPx_gpr_input_sel | SQ→SPx    | 2    | When the phase mux selects the inputs this tells from which source to read from: Interpolated data, VTX0, VTX1, autogen counter. |
| SQ_SPx_auto_count    | SQ→SPx    | 12?  | Auto count generated by the SQ, common for all shader pipes                                                                      |




### 24.2.12 SQ to SPx: Instructions

| Name                                  | Direction        | Bits | Description                                                                                                                                                     |
|---------------------------------------|------------------|------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------|
| SQ_SPx_instr_start                    | SQ→SPx           | 1    | Instruction start                                                                                                                                               |
| SQ_SP_instr                           | SQ→SPx           | 21   | Transferred over 4 cycles<br>0: SRC A Select 2:0, SRC A Argument Modifier 3:3, SRC A swizzle 11:4, VectorDst 17:12, Unused 20:18<br>1: SRC B Select 2:0, SRC B Argument Modifier 3:3, SRC B swizzle 11:4, ScalarDst 17:12, Unused 20:18<br>2: SRC C Select 2:0, SRC C Argument Modifier 3:3, SRC C swizzle 11:4, Unused 20:12<br>3: Scalar Write Mask 20:17 |
| SQ_SPx_exp_alu_id                     | SQ→SPx           |      | ALU ID                                                                                                                                                          |
| SQ_SPx_exporting                      | SQ→SPx           | 2    | 0: Not Exporting 1: Vector Exporting 2: Scalar Exporting                                                                                                        |
| SQ_SPx_stall                          | SQ→SPx           | 1    | Stall signal                                                                                                                                                    |
| SQ_SP0_write_mask                     | SQ→SP0           | 4    | Result of pixel kill in the shader pipe, which must be output for all pixel exports (depth and all color buffers). 4x4 because 16 pixels are computed per clock |
| SQ_SP1_write_mask                     | SQ→SP1           | 4    | Result of pixel kill in the shader pipe, which must be output for all pixel exports (depth and all color buffers). 4x4 because 16 pixels are computed per clock |
| SQ_SP2_write_mask                     | SQ→SP2           | 4    | Result of pixel kill in the shader pipe, which must be output for all pixel exports (depth and all color buffers). 4x4 because 16 pixels are computed per clock |
| SQ_SP3_write_mask                     | SQ→SP3           | 4    | Result of pixel kill in the shader pipe, which must be output for all pixel exports (depth and all color buffers). 4x4 because 16 pixels are computed per clock |

### 24.2.13 SP to SQ: Constant address load/Predicate Set

| Name              | Direction | Bits | Description                                                                  |
|-------------------|-----------|------|------------------------------------------------------------------------------|
| SP0_SQ_const_addr | SP0→SQ    | 36   | Constant address load / predicate vector load (4 bits only) to the sequencer |
| SP0_SQ_valid      | SP0→SQ    | 1    | Data valid                                                                   |
| SP1_SQ_const_addr | SP1→SQ    | 36   | Constant address load / predicate vector load (4 bits only) to the sequencer |



### 24.2.14 SQ to SPx: constant broadcast

| Name         | Direction | Bits | Description        |
|--------------|-----------|------|--------------------|
| SQ_SPx_const | SQ→SPx    | 128  | Constant broadcast |

#### 24.2.15 SP0 to SQ: Kill vector load

| Name             | Direction | Bits | Description      |
|------------------|-----------|------|------------------|
| SP0_SQ_kill_vect | SP0→SQ    | 4    | Kill vector load |
| SP1_SQ_kill_vect | SP1→SQ    | 4    | Kill vector load |
| SP2_SQ_kill_vect | SP2→SQ    | 4    | Kill vector load |
| SP3_SQ_kill_vect | SP3→SQ    | 4    | Kill vector load |

### 24.2.16 SQ to CP: RBBM bus

| Name           | Direction | Bits | Description          |
|----------------|-----------|------|----------------------|
| SQ_RBB_rs      | SQ→CP     | 1    | Read Strobe          |
| SQ_RBB_rd      | SQ→CP     | 32   | Read Data            |
| SQ_RBBM_nrtrtr | SQ→CP     | 1    | Optional             |
| SQ_RBBM_rtr    | SQ→CP     | 1    | Real-Time (Optional) |

#### 24.2.17 CP to SQ: RBBM bus

| Name               | Direction | Bits | Description                        |
|--------------------|-----------|------|------------------------------------|
| rbbm_we            | CP→SQ     | 1    | Write Enable                       |
| rbbm_a             | CP→SQ     | 15   | Address Upper Extent is TBD (16:2) |
| rbbm_wd            | CP→SQ     | 32   | Data                               |
| rbbm_be            | CP→SQ     | 4    | Byte Enables                       |
| rbbm_re            | CP→SQ     | 1    | Read Enable                        |
| rbb_rs0            | CP→SQ     | 1    | Read Return Strobe 0               |
| rbb_rs1            | CP→SQ     | 1    | Read Return Strobe 1               |
| rbb_rd0            | CP→SQ     | 32   | Read Data 0                        |
| rbb_rd1            | CP→SQ     | 32   | Read Data 1                        |
| RBBM_SQ_soft_reset | CP→SQ     | 1    | Soft Reset                         |

### 24.2.18 SQ to CP: State report

| Name             | Direction | Bits | Description            |
|------------------|-----------|------|------------------------|
| SQ_CP_vs_event   | SQ→CP     | 1    | Vertex Shader Event    |
| SQ_CP_vs_eventid | SQ→CP     | 2    | Vertex Shader Event ID |
| SQ_CP_ps_event   | SQ→CP     | 1    | Pixel Shader Event     |
| SQ_CP_ps_eventid | SQ→CP     | 2    | Pixel Shader Event ID  |

eventid = 0 => \*sEndOfState (i.e. VsEndOfState)

eventid = 1 => \*sDone (i.e. VsDone)

So, the CP will assume the Vs is done with a state whenever it gets a pulse on the SQ\_CP\_vs\_event and the SQ\_CP\_vs\_eventid = 0.
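The event ID encoding above can be illustrated with a small decoder. This is only a sketch of the encoding as stated here: the function name `decode_event` and the returned strings are assumptions made for illustration, and event IDs 2 and 3 are not defined in this section, so the sketch reports them as unknown rather than guessing a meaning.

```python
def decode_event(shader, event, eventid):
    """Decode one sample of SQ_CP_{vs,ps}_event / SQ_CP_{vs,ps}_eventid.

    shader  -- "Vs" or "Ps" (selects which signal pair is being decoded)
    event   -- 1 when the event line pulses, 0 otherwise
    eventid -- 2-bit event ID sampled with the pulse
    Returns the event name, or None when there is no pulse.
    """
    if shader not in ("Vs", "Ps"):
        raise ValueError("shader must be 'Vs' or 'Ps'")
    if not event:
        return None
    names = {0: "EndOfState", 1: "Done"}  # only IDs 0 and 1 are defined in the text
    if eventid not in names:
        return f"{shader}Unknown({eventid})"
    return shader + names[eventid]
```

For example, a pulse on the vertex shader event line with event ID 0 decodes to `VsEndOfState`, which is the condition the CP uses to conclude that the VS is done with a state.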




### 24.3 Example of control flow program execution

We now provide some examples of execution to better illustrate the new design.

#### Given the program:

Alu 0

Alu 1

Tex 0

Tex 1

Alu 3 Serial

Alu 4

Tex 2

Alu 5

Alu 6 Serial

Tex 3

Alu 7

Alloc Position 1 buffer

Alu 8 Export

Tex 4

Alloc Parameter 3 buffers

Alu 9 Export 0

Tex 5

Alu 10 Serial Export 2

Alu 11 Export 1 End

#### Would be converted into the following CF instructions:

Execute Alu 0 Alu 0 Tex 0 Tex 0 Alu 1 Alu 0 Tex 0 Alu 0 Alu 1 Tex 0

Execute Alu 0

Alloc Position 1

Execute Alu 0 Tex 0

Alloc Param 3

Execute Alu 0 Tex 0 Alu 1 Alu 0 End
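The conversion above can be sketched in a few lines of Python. This is not the actual compiler or assembler; it is a minimal model under stated assumptions. Each ALU or Tex instruction becomes one entry in an Execute (its type plus its serial bit), each Alloc becomes its own CF instruction and closes the Execute in progress, and End is attached to the last Execute. This section does not say why the first Execute stops after ten entries, so `MAX_PER_EXECUTE = 10` is a hypothetical limit chosen only so that the sketch reproduces the listing. The tuple layout and the name `to_cf` are also assumptions.

```python
MAX_PER_EXECUTE = 10  # ASSUMPTION: picked only to reproduce the listing above


def to_cf(program):
    """Group a shader program into CF instruction strings.

    program entries are either
      ("alu" | "tex", serial, end)   -- one ALU or Tex instruction
      ("alloc", kind, count)         -- an export allocation
    """
    cf, current = [], []

    def flush(end=False):
        # Emit the Execute being built, if it has any entries.
        if current:
            cf.append("Execute " + " ".join(current) + (" End" if end else ""))
            current.clear()

    for op in program:
        if op[0] == "alloc":
            flush()
            cf.append(f"Alloc {op[1]} {op[2]}")
            continue
        kind, serial, end = op
        if len(current) == MAX_PER_EXECUTE:
            flush()
        current.append(("Alu " if kind == "alu" else "Tex ") + ("1" if serial else "0"))
        if end:
            flush(end=True)
    flush()
    return cf


A, T = "alu", "tex"
# The example program, in order (Export flags are not carried into the CF listing).
program = [
    (A, 0, 0), (A, 0, 0), (T, 0, 0), (T, 0, 0), (A, 1, 0), (A, 0, 0),
    (T, 0, 0), (A, 0, 0), (A, 1, 0), (T, 0, 0), (A, 0, 0),
    ("alloc", "Position", 1),
    (A, 0, 0), (T, 0, 0),
    ("alloc", "Param", 3),
    (A, 0, 0), (T, 0, 0), (A, 1, 0), (A, 0, 1),
]

for line in to_cf(program):
    print(line)
```

Run on the example program, the sketch yields six CF instructions, indexed 0 to 5, which are the CFP values that appear in the state tables of the walkthrough.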

#### And the execution of this program would look like this:

#### Put thread in Vertex RS:

Control Flow Instruction Pointer (12 bits), (CFP)

Execution Count Marker (3 or 4 bits), (ECM)

Loop Iterators (4x9 bits), (LI)

Call return pointers (4x12 bits), (CRP)

Predicate Bits (4x64 bits), (PB)

Export ID (1 bit), (EXID)

GPR Base Ptr (8 bits), (GPR)

Export Base Ptr (7 bits), (EB)

Context Ptr (3 bits), (CPTR)

LOD correction bits (16x6 bits), (LOD)

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 0   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |
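To get a feel for how much state each thread carries, the field widths listed above can be tallied and packed into a single word. The sketch below is illustrative only: the packing order (CFP in the least significant bits) and the helper names `state_fields`, `state_width`, `pack` and `unpack` are assumptions, since this section gives widths but not a bit layout.

```python
def state_fields(ecm_bits=4):
    """Field names and widths as listed in the text; ECM is given as 3 or 4 bits."""
    if ecm_bits not in (3, 4):
        raise ValueError("the text gives ECM as 3 or 4 bits")
    return [
        ("CFP", 12), ("ECM", ecm_bits), ("LI", 4 * 9), ("CRP", 4 * 12),
        ("PB", 4 * 64), ("EXID", 1), ("GPR", 8), ("EB", 7),
        ("CPTR", 3), ("LOD", 16 * 6),
    ]


def state_width(ecm_bits=4):
    """Total number of state bits per thread."""
    return sum(width for _, width in state_fields(ecm_bits))


def pack(values, ecm_bits=4):
    """Pack a {name: value} dict into one integer (hypothetical layout, CFP lowest)."""
    word, shift = 0, 0
    for name, width in state_fields(ecm_bits):
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word |= value << shift
        shift += width
    return word


def unpack(word, ecm_bits=4):
    """Inverse of pack(): split the integer back into named fields."""
    out, shift = {}, 0
    for name, width in state_fields(ecm_bits):
        out[name] = (word >> shift) & ((1 << width) - 1)
        shift += width
    return out
```

With a 4-bit ECM the listed widths sum to 471 bits per thread (470 with a 3-bit ECM); the predicate bits and LOD correction bits account for most of that.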

Valid Thread (VALID)

Texture/ALU engine needed (TYPE)

Texture Reads are outstanding (PENDING)

Waiting on Texture Read to Complete (SERIAL)

Allocation Wait (2 bits) (ALLOC)




00 – No allocation needed

01 - Position export allocation needed (ordered export)

10 - Parameter or pixel export needed (ordered export)

11 - pass thru (out of order export)

Allocation Size (4 bits) (SIZE)

Position Allocated (POS\_ALLOC)

First thread of a new context (FIRST)

Last (1 bit), (LAST)

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 0       | 0      | 0     | 0    | 0         | 1     | 0    |
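The 2-bit ALLOC encoding listed above maps naturally onto an enumeration. The following is a minimal sketch; the member names and the helper `is_ordered` are chosen here for illustration and do not come from the specification.

```python
from enum import IntEnum


class Alloc(IntEnum):
    """ALLOC status field values, as listed in the text."""
    NONE = 0b00            # no allocation needed
    POSITION = 0b01        # position export allocation needed (ordered export)
    PARAM_OR_PIXEL = 0b10  # parameter or pixel export needed (ordered export)
    PASS_THRU = 0b11       # pass thru (out of order export)


def is_ordered(alloc):
    """True for the two encodings the text marks as ordered exports."""
    return alloc in (Alloc.POSITION, Alloc.PARAM_OR_PIXEL)
```

In the walkthrough that follows, the thread shows ALLOC = 01 (position) when it reaches "Alloc Position 1" and ALLOC = 10 (parameter) when it reaches "Alloc Param 3".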

Then the thread is picked up for the execution of the first control flow instruction:

Execute Alu 0 Alu 0 Tex 0 Tex 0 Alu 1 Alu 0 Tex 0 Alu 0 Alu 1 Tex 0

It executes the first two ALU instructions and goes back to the RS for a resource request change. Here is the state returned to the RS:

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 2   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | TEX  | 0       | 0      | 0     | 0    | 0         | 1     | 0    |

Then when the texture pipe frees up, the arbiter picks up the thread to issue the texture reads. The thread comes back in this state:

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 4   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 1      | 0     | 0    | 0         | 1     | 0    |

Because of the serial bit, the arbiter must wait for the texture reads to return and clear the PENDING bit before it can pick up the thread. Let's say the texture reads are complete; the arbiter then picks up the thread and returns it in this state:

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 6   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | TEX  | 0       | 0      | 0     | 0    | 0         | 1     | 0    |

Again the TP frees up, the arbiter picks up the thread and executes. It returns in this state:




| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 7   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 0      | 0     | 0    | 0         | 1     | 0    |

Now, even if the texture has not returned we can still pick up the thread for ALU execution because the serial bit is not set. The thread will however come back to the RS for the second ALU instruction because it has the serial bit set.

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 8   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 1      | 0     | 0    | 0         | 1     | 0    |

As soon as the TP clears the pending bit the thread is picked up and returns:

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 9   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | TEX  | 0       | 0      | 0     | 0    | 0         | 1     | 0    |

Picked up by the TP and returns:

Execute Alu 0

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 1          | 0   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 0      | 0     | 0    | 0         | 1     | 0    |
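The ECM, TYPE and SERIAL values seen so far follow a simple pattern: once picked up, a thread keeps executing entries of the current Execute until the next entry needs the other engine or is a serial ALU instruction, and it then returns to the RS. The sketch below models only that stepping rule for the first Execute. It is an interpretation of the walkthrough, not the hardware algorithm; the names `run_clause` and `trace` are made up for illustration, and the PENDING bit is deliberately left out because it depends on when the TP returns.

```python
# First CF instruction: (engine, serial bit) per entry, in order.
FIRST_EXECUTE = [
    ("alu", 0), ("alu", 0), ("tex", 0), ("tex", 0), ("alu", 1),
    ("alu", 0), ("tex", 0), ("alu", 0), ("alu", 1), ("tex", 0),
]


def run_clause(instrs, ecm):
    """Execute from entry `ecm` and return the ECM at which the thread stops.

    The first entry always runs (if it was serial, the arbiter already waited
    for PENDING to clear before picking the thread). Execution then continues
    while the next entry uses the same engine and is not serial.
    """
    kind = instrs[ecm][0]
    ecm += 1
    while ecm < len(instrs) and instrs[ecm][0] == kind and not instrs[ecm][1]:
        ecm += 1
    return ecm


def trace(instrs):
    """List (ECM, TYPE, SERIAL) for each return to the RS inside one Execute."""
    ecm, states = 0, []
    while ecm < len(instrs):
        ecm = run_clause(instrs, ecm)
        if ecm < len(instrs):  # otherwise the CFP advances and ECM resets to 0
            next_kind, next_serial = instrs[ecm]
            states.append((ecm, next_kind.upper(), next_serial))
    return states


for state in trace(FIRST_EXECUTE):
    print(state)
```

The six states produced by `trace` carry ECM values 2, 4, 6, 7, 8 and 9, matching the tables above; after the last Tex entry the Execute is exhausted, which corresponds to the thread returning with CFP = 1 and ECM = 0.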

Picked up by the ALU and returns (let's say the TP has not returned yet):

Alloc Position 1

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 2          | 0   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |




| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 0      | 01    | 1    | 0         | 1     | 0    |

If the SX has room for the export, the SQ allocates the space and picks up the thread for execution. It returns to the RS in this state:

Execute Alu 0 Tex 0

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 3          | 1   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | TEX  | 1       | 0      | 0     | 0    | 1         | 1     | 0    |

Now, since the TP has not returned yet, we must wait for it to return because we cannot issue multiple texture requests. The TP returns, clears the PENDING bit and we proceed:

Alloc Param 3

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 4          | 0   | 0  | 0   | 0  | 1    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 0      | 10    | 3    | 1         | 1     | 0    |

Once again the SQ makes sure the SX has enough room in the Parameter cache before it can pick up this thread.

Execute Alu 0 Tex 0 Alu 1 Alu 0 End

| State Bits |     |    |     |    |      |     |     |      |     |
|------------|-----|----|-----|----|------|-----|-----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB  | CPTR | LOD |
| 5          | 1   | 0  | 0   | 0  | 1    | 0   | 100 | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | TEX  | 1       | 0      | 0     | 0    | 1         | 1     | 0    |

This executes on the TP and then returns:

| State Bits |     |    |     |    |      |     |     |      |     |
|------------|-----|----|-----|----|------|-----|-----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB  | CPTR | LOD |
| 5          | 2   | 0  | 0   | 0  | 1    | 0   | 100 | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 1      | 0     | 0    | 1         | 1     | 1    |

The thread waits for the TP to return because the texture reads are pending (and SERIAL is set in this case). It then executes and does not return to the RS because the LAST bit is set. This is the end of the thread, and before dropping it on the floor, the SQ notifies the SX of export completion.
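The pick-up conditions used throughout this example can be collected into one predicate. The sketch below is a summary of the walkthrough only, not the arbiter's actual logic: the `Status` class, the `can_pick` function and the `sx_free_slots` parameter are hypothetical, and treating every non-zero ALLOC value (including pass thru) as requiring SX room is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Status:
    """Subset of the status bits that matter for arbitration in the example."""
    valid: bool = True
    type: str = "ALU"      # "ALU" or "TEX"
    pending: bool = False  # texture reads outstanding
    serial: bool = False   # next ALU instruction must wait for the reads
    alloc: int = 0         # 2-bit ALLOC field
    size: int = 0          # allocation size


def can_pick(status, sx_free_slots=0):
    """Return True if the thread may be picked up, per the rules in the example.

    sx_free_slots is a hypothetical count of export slots free in the SX.
    """
    if not status.valid:
        return False
    # A serial ALU instruction waits for PENDING to clear; so does a texture
    # clause, because multiple texture requests cannot be issued.
    if status.pending and (status.serial or status.type == "TEX"):
        return False
    # An allocation needs room in the SX before the thread is picked up.
    if status.alloc != 0 and sx_free_slots < status.size:
        return False
    return True
```

For instance, the state with TYPE = ALU, PENDING = 1, SERIAL = 0 is eligible immediately, whereas the final state (PENDING = 1, SERIAL = 1) is not eligible until the TP returns.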

### 25. Open issues

Need to do some testing on the size of the register file as well as on the register file allocation method (dynamic vs. static).

Saving power?




Laurent Lefebvre

| Issue To: | Copy No: |
|-----------|----------|
|           |          |

# **R400 Sequencer Specification**

## SQ

## Version 2.024

**Overview:** This is an architectural specification for the R400 Sequencer block (SEQ). It provides an overview of the required capabilities and expected uses of the block. It also describes the block interfaces, internal subblocks, and provides internal state diagrams.

#### **AUTOMATICALLY UPDATED FIELDS:**

**Document Location:** C:\perforce\r400\doc\_lib\design\blocks\sq\R400\_Sequencer.doc

Current Intranet Search Title: R400 Sequencer Specification

| APPROVALS |                |
|-----------|----------------|
| Name/Dept | Signature/Date |
|           |                |
|           |                |
|           |                |
|           |                |

#### Remarks:

THIS DOCUMENT CONTAINS CONFIDENTIAL INFORMATION THAT COULD BE SUBSTANTIALLY DETRIMENTAL TO THE INTEREST OF ATI TECHNOLOGIES INC. THROUGH UNAUTHORIZED USE OR DISCLOSURE.

"Copyright 2001, ATI Technologies Inc. All rights reserved. The material in this document constitutes an unpublished work created in 2001. The use of this copyright notice is intended to provide notice that ATI owns a copyright in this unpublished work. The copyright notice is not an admission that publication has occurred. This work contains confidential, proprietary information and trade secrets of ATI. No part of this document may be used, reproduced, or transmitted in any form or by any means without the prior written permission of ATI Technologies Inc."

Exhibit 2030 docR400\_Sequencer.doc 74578 Bytes\*\*\* © ATI Confidential. Reference Copyright Notice on Cover Page © \*\*\*

ATI 2030 LG v. ATI IPR2015-00325


## Table Of Contents

| Section | Title                                          |
|---------|------------------------------------------------|
| 1.      | OVERVIEW                                       |
| 1.1     | Top Level Block Diagram                        |
| 1.2     | Data Flow graph (SP)                           |
| 1.3     | Control Graph                                  |
| 2.      | INTERPOLATED DATA BUS                          |
| 3.      | INSTRUCTION STORE                              |
| 4.      | SEQUENCER INSTRUCTIONS                         |
| 5.      | CONSTANT STORES                                |
| 5.1     | Memory organizations                           |
| 5.2     | Management of the Control Flow Constants       |
| 5.3     | Management of the re-mapping tables            |
| 5.3.1   |                                                |
| 5.3.2   |                                                |
| 5.3.3   | Dirty bits                                     |
| 5.3.4   | Free List Block                                |
| 5.3.5   | De-allocate Block                              |
| 5.3.6   |                                                |
| 5.4     | Constant Store Indexing                        |
| 5.5     | Real Time Commands                             |
| 5.6     | Constant Waterfalling                          |
| 6.      | LOOPING AND BRANCHES                           |
| 6.1     | The controlling state                          |
| 6.2     | The Control Flow Program                       |
| 6.2.1   | Control flow instructions table                |
| 6.3     | Implementation                                 |
| 6.4     | Data dependant predicate instructions          |
| 6.5     | HW Detection of PV,PS                          |
| 6.6     | Register file indexing                         |
| 6.7     | Debugging the Shaders                          |
| 6.7.1   | Method 1: Debugging registers                  |
| 6.7.2   | Method 2: Exporting the values in the GPRs     |
| 7.      | PIXEL KILL MASK                                |
| 8.      | MULTIPASS VERTEX SHADERS (HOS)                 |
| 9.      | REGISTER FILE ALLOCATION                       |
| 10.     | FETCH ARBITRATION                              |
| 11.     | ALU ARBITRATION                                |
| 12.     | HANDLING STALLS                                |
| 13.     | CONTENT OF THE RESERVATION STATION FIFOS       |
| 14.     | THE OUTPUT FILE                                |
| 15.     | IJ FORMAT                                      |
| 15.1    | Interpolation of constant attributes           |
| 16.     | STAGING REGISTERS                              |

| Section | Title |
|---------|-------|
| 17.     | THE PARAMETER CACHE |
| 17.1    | ort restrictions |
| 17.1.1  |  |
| 17.1.2  |  |
| 17.1.3  |  |
| 18.     | EXPORT TYPES |
| 18.1    | Vertex Shading |
| 18.2    | Pixel Shading |
| 24.2.1  |  |
| 24.2.2  |  |
| 24.2.3  |  |
| 24.2.4  |  |
| 24.2.5  |  |
| 24.2.6  |  |
| 24.2.7  |  |
| 24.2.8  |  |
| 24.2.9  |  |
| 24.2.10 | SQ to SP: Texture sta |
| 24.2.11 | SQ to SP: GPR and a |
| 24.2.12 | SQ to SPx: Instruction |
| 24.2.13 | SP to SQ: Constant address load / Predicate Set |
| 24.2.14 | SQ to SPx: constant broadcast |
| 24.2.15 | SP0 to SQ: Kill vector load |
| 24.2.16 | SQ to CP: RBBM bus |
| 24.2.17 | CP to SQ: RBBM bus |
| 24.2.18 | SQ to CP: State report |
| 24.3    | Example of control flow program execution |
| 25.     | OPEN ISSUES |

ORIGINATE DATE: 24 September, 2001




## Revision Changes:

Rev 0.1 (Laurent Lefebvre) Date: May 7, 2001

Rev 0.2 (Laurent Lefebvre) Date: July 9, 2001

Rev 0.3 (Laurent Lefebvre) Date: August 6, 2001

Rev 0.4 (Laurent Lefebvre) Date: August 24, 2001

Rev 0.5 (Laurent Lefebvre) Date: September 7, 2001

Rev 0.6 (Laurent Lefebvre) Date: September 24, 2001

Rev 0.7 (Laurent Lefebvre) Date: October 5, 2001

Rev 0.8 (Laurent Lefebvre) Date: October 8, 2001

Rev 0.9 (Laurent Lefebvre) Date: October 17, 2001

Rev 1.0 (Laurent Lefebvre) Date: October 19, 2001

Rev 1.1 (Laurent Lefebvre) Date: October 26, 2001

Rev 1.2 (Laurent Lefebvre) Date: November 16, 2001

Rev 1.3 (Laurent Lefebvre) Date: November 26, 2001

Rev 1.4 (Laurent Lefebvre) Date: December 6, 2001

Rev 1.5 (Laurent Lefebvre) Date: December 11, 2001

Rev 1.6 (Laurent Lefebvre) Date: January 7, 2002

Rev 1.7 (Laurent Lefebvre) Date: February 4, 2002

Rev 1.8 (Laurent Lefebvre) Date: March 4, 2002

Rev 1.9 (Laurent Lefebvre) Date: March 18, 2002

Rev 1.10 (Laurent Lefebvre) Date: March 25, 2002

Rev 1.11 (Laurent Lefebvre) Date: April 19, 2002

Rev 2.0 (Laurent Lefebvre) Date: April 19, 2002

First draft.

Changed the interfaces to reflect the changes in the SP. Added some details in the arbitration section. Reviewed the Sequencer spec after the meeting on August 3, 2001.

Added the dynamic allocation method for register file and an example (written in part by Vic) of the flow of pixels/vertices in the sequencer. Added timing diagrams (Vic)

Changed the spec to reflect the new R400 architecture. Added interfaces.

Added constant store management, instruction store management, control flow management and data-dependent predication.

Changed the control flow method to be more flexible. Also updated the external interfaces.

Incorporated changes made in the 10/18/01 control flow meeting. Added a NOP instruction, removed the conditional execute or jump. Added debug registers.

Refined interfaces to RB. Added state registers.

Added SEQ→SP0 interfaces. Changed delta precision. Changed VGT-SP0 interface. Debug Methods added.

Interfaces greatly refined. Cleaned up the spec.

Added the different interpolation modes.

Added the auto incrementing counters. Changed the VGT→SQ interface. Added content on constant management. Updated GPRs.

Removed from the spec all interfaces that weren't directly tied to the SQ. Added explanations on management. Added synchronization fields and explanation.

Added more details on the staging register. Added detail about the parameter caches. Changed the call instruction to a conditional call instruction. Added details on constant management and updated the diagram.

Added Real Time parameter control in the SX interface. Updated the control flow section.

New interfaces to the SX block. Added the end of clause modifier, removed the end of clause instructions.

Rearrangement of the CF instruction bits in order to ensure byte alignment.

Updated the interfaces and added a section on exporting rules.

Added CP state report interface. Last version of the spec with the old control flow scheme.

New control flow scheme.




Rev 2.01 (Laurent Lefebvre) Date: May 2, 2002

Rev 2.02 (Laurent Lefebvre) Date: May 13, 2002

Changed slightly the control flow instructions to allow force jumps and calls. Updated the Opcodes. Added type field to the constant/pred interface. Added Last field to the SQ→SP instruction load interface.




## 1. Overview

The sequencer chooses two ALU threads and a fetch thread to execute, and executes all of the instructions in a block before looking for a new clause of the same type. The two ALU threads are executed interleaved to hide the ALU latency. The arbiter gives priority to older threads. There are two separate reservation stations, one for pixel vectors and one for vertex vectors, so a pixel can pass a vertex and a vertex can pass a pixel.
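The oldest-first priority and the separate pixel and vertex stations can be illustrated with a small Python model. This is only a sketch: the `Thread` and `ReservationStation` classes, the `age` and `ready` fields, and the sample values are all hypothetical and are not taken from the hardware design.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Thread:
    age: int     # hypothetical: a larger value means an older thread
    ready: bool  # hypothetical: True if the thread can be issued now


@dataclass
class ReservationStation:
    """Toy model of one reservation station (pixel or vertex)."""
    threads: List[Thread] = field(default_factory=list)

    def pick_oldest_ready(self) -> Optional[Thread]:
        """Return the oldest ready thread, or None if nothing is ready."""
        ready = [t for t in self.threads if t.ready]
        return max(ready, key=lambda t: t.age) if ready else None


# Separate stations: a stalled old pixel thread does not block the vertices.
pixel_rs = ReservationStation([Thread(age=5, ready=False), Thread(age=2, ready=True)])
vertex_rs = ReservationStation([Thread(age=9, ready=True), Thread(age=1, ready=True)])
```

Because each station is searched independently, a ready vertex thread can be issued while an older pixel thread is still waiting, which is the sense in which one vector type can pass the other.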

To support the shader pipe, the sequencer also contains the shader instruction cache, constant store, control flow constants and texture state. All four shader pipes execute the same instruction, so there is only one sequencer for the whole chip.

The sequencer first arbitrates between vectors of 64 vertices that arrive directly from primitive assembly and vectors of 16 quads (64 pixels) that are generated in the scan converter.

The vertex or pixel program specifies how many GPRs it needs to execute. The sequencer will not start the next vector until the needed space is available in the GPRs.
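The gating rule above can be sketched as a simple reservation check. The `GprPool` class, its method names and the pool size used in the usage lines are assumptions made for illustration; the text only states that a vector does not start until its GPR space is available.

```python
class GprPool:
    """Toy model of the GPR space shared by the vectors in flight."""

    def __init__(self, total: int) -> None:
        self.free = total  # hypothetical pool size, in GPRs

    def try_start(self, gprs_needed: int) -> bool:
        """Reserve space for a new vector; refuse if it does not fit yet."""
        if gprs_needed > self.free:
            return False  # the vector waits until space is released
        self.free -= gprs_needed
        return True

    def release(self, gprs: int) -> None:
        """Give back the space of a vector that has finished."""
        self.free += gprs


pool = GprPool(total=16)          # 16 is an arbitrary example value
first = pool.try_start(12)        # fits: 4 GPRs remain free
second = pool.try_start(8)        # does not fit: the vector must wait
pool.release(12)                  # the first vector completes
third = pool.try_start(8)         # now fits
```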



## 1.1 Top Level Block Diagram



Figure 2: Reservation stations and arbiters

Under this new scheme, the sequencer (SQ) will only use one global state management machine per vector type (pixel, vertex) that we call the reservation station (RS).



## 1.2 Data Flow graph (SP)



Figure 3: The shader Pipe


The gray area represents blocks that are replicated 4 times per shader pipe (16 times on the overall chip).

## 1.3 Control Graph



Figure 4: Sequencer Control interfaces

The fetch control interface is shown in green, the ALU control interface in red, the interpolated/vector control interface in blue, and the output file control interface in purple.

## 2. Interpolated data bus

The interpolators contain an IJ buffer to pack the information as much as possible before writing it to the register file.



Figure 5: Interpolation buffers


[Timing grid: clock steps up to T23 on one axis, shader pipes SP0 to SP3 on the other. Cell contents are not recoverable.]

Figure 6: Interpolation timing diagram




Above is an example of a tile the sequencer might receive from the SC. The write side shows how the data is stacked into the XY and IJ buffers; the read side shows how the data is passed to the GPRs. The IJ information is packed into the IJ buffer 4 quads at a time (two clocks). At any given time the sequencer allows as many as four quads to interpolate a parameter, and they must all come from the same primitive. The sequencer then controls the write mask to the GPRs so that only the valid data is written.

## 3. Instruction Store

There is going to be only one instruction store for the whole chip. It will contain 4096 instructions of 96 bits each.

It is likely to be a 1-port memory; we use 1 clock to load the ALU instruction, 1 clock to load the Fetch instruction, 1 clock to load 2 control flow instructions and 1 clock to write instructions.

The instruction store is loaded by the CP thru the register mapped registers.

The VS\_BASE and PS\_BASE context registers are used to specify for each context where its shader is in the instruction memory.

Real time commands work in much the same way, with some small differences. There are no wrap-around points for real time, so the driver must be careful not to overwrite regular shader data. The shared code (shared subroutines) uses the same path as real time.

## 4. Sequencer Instructions

All control flow instructions and move instructions are handled by the sequencer only. The ALUs will perform NOPs during this time (MOV PV,PV, PS,PS) if they have nothing else to do.

## 5. Constant Stores

## 5.1 Memory organizations

A likely size for the ALU constant store is 1024x128 bits. The read BW from the ALU constant store is 128 bits/clock and the write bandwidth is 32 bits/clock (directed by the CP bus size not by memory ports).

The maximum logical size of the constant store for a given shader is 256 constants, or 512 for the pixel/vertex shader pair. The size of the re-mapping table is 128 lines (each line addresses 4 constants). The write granularity is 4 constants, or 512 bits, so at 32 bits/clock it takes 16 clocks to write the four constants. Real time requires 256 lines in the physical memory (this is physically register mapped).

The texture state is also kept in a similar memory. The size of this memory is 320x96 bits (128 texture states for regular mode, 32 states for RT). The memory thus holds 128 texture states (192 bits per state). The logical size exposes 32 different states total, which are shared between the pixel and the vertex shader. The size of the re-mapping table for the texture state memory is 32 lines (each line addresses 1 texture state line in the real memory). The CP write granularity is 1 texture state line (or 192 bits). The driver sends 512 bits but the CP ignores the top 320 bits. It thus takes 6 clocks to write the texture state. Real time requires 32 lines in the physical memory (this is physically register mapped).
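The two write times quoted above follow directly from the 32 bits/clock CP write bandwidth. A minimal sketch of that arithmetic; the function and constant names are illustrative and do not come from the design:

```python
CP_WRITE_BITS_PER_CLOCK = 32  # write bandwidth quoted for the CP bus


def write_clocks(payload_bits: int) -> int:
    """Clocks needed to move payload_bits over the CP bus, rounded up."""
    return -(-payload_bits // CP_WRITE_BITS_PER_CLOCK)


ALU_LINE_BITS = 4 * 128      # one re-mapping line covers 4 constants of 128 bits
TEXTURE_STATE_BITS = 192     # one texture state line

print(write_clocks(ALU_LINE_BITS))       # 16
print(write_clocks(TEXTURE_STATE_BITS))  # 6
```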

The control flow constant memory doesn't sit behind a renaming table. It is register mapped and thus the driver must reload its content each time there is a change in the control flow constants. Its size is 320\*32 because it must hold 8 copies of the 32 dwords of control flow constants and the loop construct constants must be aligned.

The constant re-mapping tables for texture state and ALU constants are logically register mapped for regular mode and physically register mapped for RT operation.


## 5.2 Management of the Control Flow Constants

The control flow constants are register mapped: the CP writes to the corresponding register to set a constant, and the SQ decodes the address and writes to the block pointed to by its current base pointer (CF\_WR\_BASE). On the read side, one level of indirection is used. A register (SQ\_CONTEXT\_MISC.CF\_RD\_BASE) keeps the current base pointer to the control flow block. This register is copied whenever there is a state change. Should the CP write to CF after the state change, the base register is updated with (current pointer number + 1) % number of states. This way, if the CP doesn't write to CF, the new state uses the previous CF constants.
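The base pointer handling described above can be sketched in software as follows. This is only an illustration under stated assumptions: the class and attribute names are hypothetical, the initial state is assumed to write block 0 directly, and unchanged constants are assumed to be carried over when the base pointer advances (the text does not say how the new block is populated).

```python
NUM_STATES = 8   # the text says 8 copies of the constants are held
CF_DWORDS = 32   # dwords of control flow constants per copy


class ControlFlowConstants:
    def __init__(self):
        self.blocks = [[0] * CF_DWORDS for _ in range(NUM_STATES)]
        self.wr_base = 0                  # plays the role of CF_WR_BASE
        self.rd_base = [0] * NUM_STATES   # per-state copy of CF_RD_BASE
        self.state = 0
        self.written_in_state = True      # assumption: state 0 writes block 0

    def state_change(self):
        """A new state starts out pointing at the previous state's block."""
        self.state = (self.state + 1) % NUM_STATES
        self.rd_base[self.state] = self.wr_base
        self.written_in_state = False

    def cp_write(self, offset, value):
        """The first CF write after a state change moves to the next block."""
        if not self.written_in_state:
            new_base = (self.wr_base + 1) % NUM_STATES
            # Assumption: unchanged constants are carried over to the new block.
            self.blocks[new_base] = list(self.blocks[self.wr_base])
            self.wr_base = new_base
            self.rd_base[self.state] = new_base
            self.written_in_state = True
        self.blocks[self.wr_base][offset] = value

    def read(self, state, offset):
        return self.blocks[self.rd_base[state]][offset]
```

With this model, a state that never writes CF keeps reading the previous state's block, and the first write after a state change leaves the older state's view untouched.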

## 5.3 Management of the re-mapping tables

## 5.3.1 R400 Constant management

The sequencer is responsible for managing two re-mapping tables (one for the constant store and one for the texture state). On a state change (by the driver), the sequencer will broadside copy the contents of its re-mapping tables to a new one. We have 8 different re-mapping tables we can use concurrently.

The constant memory update will be incremental: the driver only needs to update the constants that actually changed between the two state changes.

For this model to work in its simplest form, the physical memory MUST be at least twice as large as the logical address space plus the space allocated for Real Time. In our case, since the logical address space is 512 entries and the reserved RT space can be up to 256 entries, the memory must hold at least 2\*512+256 = 1280 entries. Similarly, the texture store must hold at least 32\*2+32 = 96 entries.
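The sizing rule above is simple enough to check mechanically. A small sketch, with a hypothetical helper name, that reproduces both figures:

```python
def min_physical_entries(logical_entries: int, rt_entries: int) -> int:
    """Two full copies of the logical space plus the Real Time reservation."""
    return 2 * logical_entries + rt_entries


print(min_physical_entries(512, 256))  # ALU constant store: 1280
print(min_physical_entries(32, 32))    # texture state store: 96
```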

## 5.3.2 Proposal for R400LE constant management

To make this scheme work with only 512+256 = 768 entries, upon reception of a CONTROL packet of state + 1, the sequencer would check for SQ\_IDLE and PA\_IDLE and, if both are idle, erase the content of the state to replace it with the new state (this is depicted in Figure 8: De-allocation mechanism). Note that when a state is cleared, a value of 0 is written to the corresponding de-allocation counter location so that when the SQ reports a state change, nothing will be de-allocated upon the first report.

The second path sets all context dirty bits that were used in the current state to 1 (thus allowing the new state to reuse these physical addresses if needed).






Figure 7: Constant management




Figure 8: De-allocation mechanism for R400LE

## 5.3.3 Dirty bits

Two sets of dirty bits will be maintained per logical address. The first one (reset dirty) will be set to zero on reset and set when the logical address is addressed. The second one (context dirty) will be set to zero whenever a new context is written and set for each address written while in this context. If the reset dirty bit is not set, then writing to that logical address will not require de-allocation of whatever address is stored in the renaming table. If it is set and the context dirty bit is not set, then the stored physical address needs to be de-allocated and a new physical address is necessary to store the incoming data. If they are both set, then the data will be written into the physical address held in the renaming table for the current logical address; no de-allocation or allocation takes place. This will happen when the driver does a set constant twice to the same logical address between context changes. NOTE: It is important to detect this case and avoid a new allocation. Failure to do so would allow multiple writes to allocate all physical memory and thus hang, because a context would not fit for rendering to start and thus free up space.
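The three cases above amount to a small decision function on the two dirty bits. The sketch below is illustrative only; the enum and function names are hypothetical. The combination "reset dirty clear, context dirty set" should not occur according to the text and is treated here like a first write, which is an assumption.

```python
from enum import Enum


class WriteAction(Enum):
    ALLOCATE = "allocate a new block, nothing to de-allocate"
    REALLOCATE = "de-allocate the old block and allocate a new one"
    OVERWRITE = "reuse the block already held in the renaming table"


def classify_write(reset_dirty: bool, context_dirty: bool) -> WriteAction:
    """Decide what a set-constant write to one logical address requires."""
    if not reset_dirty:
        # Address never written since reset: no old block to free.
        return WriteAction.ALLOCATE
    if not context_dirty:
        # Written in an earlier context: the old block may still be in use.
        return WriteAction.REALLOCATE
    # Written twice in the same context: overwrite in place.
    return WriteAction.OVERWRITE
```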

#### 5.3.4 Free List Block

The free list block would consist of a counter (called the IFC or Initial Free Counter) that resets to zero and is incremented every time a chunk of physical memory is used, until all chunks have been used once. This counter would be checked each time a physical block is needed: if the original blocks have not all been used up, use a new one, else check the free list for an available physical block address. When a chunk is taken from the counter, the count is the physical address.

The block also needs storage for a free list big enough to store all physical block addresses.

Maintain three pointers for the free list that are reset to zero. The first one we will call write\_ptr. This pointer will identify the next location to write the physical address of a block to be de-allocated. Note: we can never free more physical memory locations than we have. Once an address has been recorded, the pointer will be incremented to walk the free list like a ring.

The second pointer will be called stop\_ptr. The stop\_ptr pointer will be advanced by the number of address chunks de-allocated when a context finishes. The addresses between the stop\_ptr and write\_ptr cannot be reused because they are still in use. But as soon as the context using them is dismissed the stop\_ptr will be advanced.

The third pointer will be called read\_ptr. This pointer will point to the next address that can be used for allocation as long as the read\_ptr does not equal the stop\_ptr and the IFC is at its maximum count.
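The counter and the three pointers can be modelled with a few lines of software. This is a sketch, not the hardware design: the class and method names are hypothetical, and the pointers are kept as unbounded counters that are wrapped on access, an assumption made here to sidestep the full-versus-empty ambiguity of a ring whose pointers are equal.

```python
class FreeList:
    """Ring of de-allocated block addresses plus the Initial Free Counter."""

    def __init__(self, num_blocks: int):
        self.num_blocks = num_blocks
        self.ifc = 0                      # Initial Free Counter
        self.ring = [None] * num_blocks   # storage for freed block addresses
        # Assumption: pointers are unbounded counters, wrapped on access.
        self.write_ptr = 0
        self.stop_ptr = 0
        self.read_ptr = 0

    def allocate(self):
        """Return a physical block address, or None when none is available."""
        if self.ifc < self.num_blocks:
            addr = self.ifc               # the count itself is the address
            self.ifc += 1
            return addr
        if self.read_ptr != self.stop_ptr:
            addr = self.ring[self.read_ptr % self.num_blocks]
            self.read_ptr += 1
            return addr
        return None                       # nothing reusable yet

    def record_freed(self, addr: int) -> None:
        """Queue a block for de-allocation; it stays unusable for now."""
        self.ring[self.write_ptr % self.num_blocks] = addr
        self.write_ptr += 1

    def retire_context(self, freed_count: int) -> None:
        """A context finished: release the blocks it had queued."""
        self.stop_ptr += freed_count
```

A block recorded with `record_freed` only becomes allocatable after `retire_context` has moved stop\_ptr past it, which mirrors the rule that addresses between stop\_ptr and write\_ptr are still in use.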




## 5.3.5 De-allocate Block

This block will maintain a free physical address block count for each context. While in the current context, a count shall be maintained specifying how many blocks were written into the free list at the write\_ptr pointer. This count will be reset upon reset or when this context is active on the back and different than the previous context. It is actually a count of blocks in the previous context that will no longer be used. This count will be used to advance the stop\_ptr pointer to make available the set of physical blocks freed when the previous context was done. This allows the discard or de-allocation of any number of blocks in one clock.

## 5.3.6 Operation of Incremental model

The basic operation of the model would start with the write\_ptr, stop\_ptr and read\_ptr pointers in the free list set to zero and the free list counter set to zero. Also, all the dirty bits and the previous context will be initialized to zero. When the first set constants happen, the reset dirty bit will not be set, so we will allocate a physical location from the free list counter because it is not at the max value. The data will be written into physical address zero. Both the additional copy of the renaming table and the context zero entry of the big renaming table will be updated, for the logical address that was written by the set, with physical address 0. This process will be repeated for any logical addresses that are not dirty until the context changes. If a logical address is hit that has its dirty bits set while in the same context, both dirty bits would be set, so the new data will overwrite the last physical address assigned for this logical address. When the first draw command of the context is detected, the previous context stored in the additional renaming table will be copied to the larger renaming table in the current (new) context location. Then the set constant logical address will be loaded with a new physical address during the copy and, if the reset dirty bit was set, the physical address it replaced in the renaming table would be entered at the write\_ptr pointer location on the free list and the write\_ptr will be incremented. The de-allocation counter for the previous context (eight) will be incremented. Then, as set states come in for this context, one of the following will happen:

- 1.) No dirty bits are set for the logical address being updated. A line will be allocated from the free-list counter, or from the free list at the read\_ptr pointer if read\_ptr != stop\_ptr.
- 2.) Reset dirty set and context dirty not set. A new physical address is allocated, the physical address in the renaming table is put on the free list at write\_ptr, and write\_ptr is incremented along with the de-allocate counter for the last context.
- 3.) Context dirty is set. The data will be written into the physical address that the renaming table holds for the logical address.

This process will continue as long as set states arrive. This block will provide backpressure to the CP whenever it has no free list entries available (counter at max and stop\_ptr == read\_ptr). The command stream will keep a count of the contexts of constants in use and prevent more than the maximum number of constant contexts from being sent.

Whenever a draw packet arrives, the content of the re-mapping table is written to the correct re-mapping table for the context number. Also, if the next context uses fewer constants than the current one, all exceeding lines are moved to the free list to be de-allocated later. This happens in parallel with the writing of the re-mapping table to the correct memory.

Preferably, when the constant context leaves the last ALU clause it will be sent to this block and compared with the previous context that left (initialized to zero). If they differ, then the older context will no longer be referenced and thus can be de-allocated in the physical memory. This is accomplished by adding the number of blocks freed by this context to the stop\_ptr pointer. This will make all the physical addresses used by this context available to the read\_ptr allocate pointer for future allocation.

This device allows representation of multiple contexts of constant data with N copies of the logical address space. It also allows the second context to be represented as the first set plus some new additional data by just storing the deltas. It allows memory to be used efficiently, and when the constant updates are small it can store multiple contexts. However, if the updates are large, fewer contexts will be stored and performance will potentially be degraded, although it will still perform as well as a ring could in this case.

## 5.4 Constant Store Indexing

In order to do constant store indexing, the sequencer must first be loaded with the indexes (that come from the GPRs). There are 144 wires from the exit of the SP to the sequencer (9-bit pointers x 16 vertexes/clock). Since the data must pass thru the Shader pipe for the float to fixed conversion, there is a latency of 4 clocks (1 instruction) between the time the sequencer is loaded and the time one can index into the constant store. The assembly will look like this:

MOVA R1.X,R2.X // Loads the sequencer with the content of R2.X, also copies the content of R2.X into R1.X

NOP // latency of the float to fixed conversion

ADD R3,R4,C0[R2.X] // Uses the state from the sequencer to add R4 to C0[R2.X] into R3

Note that we don't really care about what is in the brackets because we use the state from the MOVA instruction. R2.X is just written again for the sake of simplicity and coherency.

The storage needed in the sequencer in order to support this feature is 2\*64\*9 bits = 1152 bits.
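The wire count and the storage figure for constant store indexing can be cross-checked with a short calculation. The meaning given to the factors 64 and 2 below (vertices per vector, and number of vectors whose indexes are held) is an assumption; the text only gives the product.

```python
INDEX_BITS = 9            # width of one constant store pointer
VERTICES_PER_CLOCK = 16   # indexes leaving the SP each clock
VECTOR_SIZE = 64          # assumption: the 64 is the number of vertices per vector
BUFFERED_VECTORS = 2      # assumption: the factor of 2 is the number of vectors held

wires = INDEX_BITS * VERTICES_PER_CLOCK
storage_bits = BUFFERED_VECTORS * VECTOR_SIZE * INDEX_BITS

print(wires)         # 144
print(storage_bits)  # 1152
```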

#### 5.5 Real Time Commands

The real time command constants are written by the CP using the register mapped registers allocated for RT. This works the same way as regular constant loads, BUT in this case the CP sends a physical address rather than a logical address, and the reads do not pass thru the re-mapping table but are read directly from the memory. The boundary between the two zones is defined by the CONST\_EO\_RT control register. Similarly, for the fetch state, the boundary between the two zones is defined by the TSTATE\_EO\_RT control register.

#### 5.6 Constant Waterfalling

In order to have reasonable performance in the case of constant store indexing using the address register, we are going to have the possibility of using the physical memory port for read only. This way we can read 1 constant per clock and thus have a worst-case waterfall mode of 1 vertex per clock. There is a small synchronization issue related to this, as the SQ needs to make sure that the constants were actually written to memory (not only sent to the sequencer) before it can allow the first vector of pixels or vertices of the state to go thru the ALUs. To do so, the sequencer keeps 8 bits (one per render state); it sets the bit whenever the last render state is written to memory and clears the bit whenever a state is freed.



Figure 9: The instruction store


## 6. Looping and Branches

Loops and branches are planned to be supported and will have to be dealt with at the sequencer level. We plan on supporting constant loops and branches using a control program.

#### 6.1 The controlling state.

The R400 controlling state consists of:

- Boolean[255:0]
- Loop\_count[7:0][31:0]
- Loop\_Start[7:0][31:0]
- Loop\_Step[7:0][31:0]

That is 256 Booleans and 32 loops.

We have a stack of 4 elements for nested calls of subroutines and 4 loop counters to allow for nested loops.

This state is available on a per shader program basis.

#### 6.2 The Control Flow Program

We'd like to be able to code up a program of the form:

1: Loop

2: Exec TexFetch

3: TexFetch

4: ALU

5: ALU

6: TexFetch

7: End Loop

8: ALU Export

But realize that 3: may be dependent on 2: and 4: is almost certainly dependent on 2: and 3:. Without clausing, these dependencies need to be expressed in the Control Flow instructions. Additionally, without separate 'texture clauses' and 'ALU clauses' we need to know which instructions to dispatch to the Texture Unit and which to the ALU unit. This information will be encapsulated in the flow control instructions.

Each control flow instruction will contain 2 bits of information for each (non-control flow) instruction:

- a) ALU or Texture
- b) Serialize Execution

(b) would force the thread to stop execution at this point (before the instruction is executed) and wait until all textures have been fetched. Given the allocation of reserved bits, this would mean that the count of an 'Exec' instruction would be limited to about 8 (non-control-flow) instructions. If more than this were needed, a second Exec (with the same conditions) would be issued.

Another function that relies upon 'clauses' is allocation and order of execution. We need to ensure that pixels and vertices are exported in the correct order (even if not all execution is ordered) and that space in the output buffers is allocated in order. Additionally, data can't be exported until space is allocated. A new control flow instruction:

Alloc <buffer select -- position,parameter, pixel or vertex memory. And the size required>.

would be created to mark where such allocation needs to be done. To ensure allocation is done in order, the actual allocation for a given thread cannot be performed unless the equivalent allocation for all previous threads is already completed. The implementation would also ensure that execution of instruction(s) following the serialization due to the Alloc will occur in order, at least until the next serialization or change from ALU to Texture. In most cases this will allow the exports to occur without any further synchronization. Only 'final' allocations or position allocations are guaranteed to be ordered. Because strict ordering is required for pixels, parameters and positions, this implies only a single alloc for these structures. Vertex exports to memory do not require ordering during allocation and so multiple 'allocs' may be done.

#### 6.2.1 Control flow instructions table

Here is the revised control flow instruction set.

Note that whenever a field is marked as RESERVED, it is assumed that all the bits of the field are cleared (0).

| NOP   |            |          |
|-------|------------|----------|
| 47:44 | 43         | 42:0     |
| 0000  | Addressing | RESERVED |

This is a regular NOP.

| Execute |            |          |                                                |       |              |
|---------|------------|----------|------------------------------------------------|-------|--------------|
| 47:44   | 43         | 40:34    | 33:16                                          | 15:12 | 11:0         |
| 0001    | Addressing | RESERVED | Instructions type + serialize (9 instructions) | Count | Exec Address |

| Execute End |            |          |                                                |       |              |
|-------------|------------|----------|------------------------------------------------|-------|--------------|
| 47:44       | 43         | 40:34    | 33:16                                          | 15:12 | 11:0         |
| 0010        | Addressing | RESERVED | Instructions type + serialize (9 instructions) | Count | Exec Address |

Execute up to 9 instructions at the specified address in the instruction memory. The Instruction type field tells the sequencer the type of each instruction (LSB: 1 = Texture, 0 = ALU) and whether or not to serialize its execution (MSB: 1 = Serialize, 0 = Non-Serialized). If Execute End, this is the last execution block of the shader program.
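To make the Execute layout concrete, the sketch below unpacks the fields of a 48-bit control flow word. The field positions are taken from the table above; the helper names are hypothetical, and the order of the 2-bit slots inside bits 33:16 (instruction 0 in the lowest pair) is an assumption, since the text does not state it.

```python
def field(word: int, hi: int, lo: int) -> int:
    """Extract bits hi..lo (inclusive) of a control flow instruction word."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_execute(word: int) -> dict:
    count = field(word, 15, 12)
    type_bits = field(word, 33, 16)
    instructions = []
    for i in range(count):
        # Assumed slot order: instruction 0 sits in the lowest 2 bits.
        pair = (type_bits >> (2 * i)) & 0b11
        instructions.append({
            "unit": "Texture" if pair & 0b01 else "ALU",  # LSB of the pair
            "serialize": bool(pair & 0b10),               # MSB of the pair
        })
    return {
        "opcode": field(word, 47, 44),
        "addressing": field(word, 43, 43),
        "count": count,
        "exec_address": field(word, 11, 0),
        "instructions": instructions,
    }


# Execute (opcode 0001), 2 instructions at address 0x123:
# instruction 0 = Texture, not serialized; instruction 1 = ALU, serialized.
word = (0b0001 << 44) | (0b1001 << 16) | (2 << 12) | 0x123
decoded = decode_execute(word)
```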


| Conditional_Execute |            |           |                 |                                                |       |              |
|---------------------|------------|-----------|-----------------|------------------------------------------------|-------|--------------|
| 47:44               | 43         | 42        | 41:34           | 33:16                                          | 15:12 | 11:0         |
| 0011                | Addressing | Condition | Boolean address | Instructions type + serialize (9 instructions) | Count | Exec Address |

| Conditional_Execute_End |            |           |                 |                                                |       |              |
|-------------------------|------------|-----------|-----------------|------------------------------------------------|-------|--------------|
| 47:44                   | 43         | 42        | 41:34           | 33:16                                          | 15:12 | 11:0         |
| 0100                    | Addressing | Condition | Boolean address | Instructions type + serialize (9 instructions) | Count | Exec Address |

If the specified Boolean (8 bits can address 256 Booleans) meets the specified condition then execute the specified instructions (up to 9 instructions). If the condition is not met, we go on to the next control flow instruction. If Conditional Execute End and the condition is met, this is the last execution block of the shader program.

| Conditional_Execute_Predicates |            |           |          |                  |                                                |       |              |
|--------------------------------|------------|-----------|----------|------------------|------------------------------------------------|-------|--------------|
| 47:44                          | 43         | 42        | 41:36    | 35:34            | 33:16                                          | 15:12 | 11:0         |
| 0101                           | Addressing | Condition | RESERVED | Predicate vector | Instructions type + serialize (9 instructions) | Count | Exec Address |


| Conditional_Execute_Predicates_End |            |           |          |                  |                                                |       |              |
|------------------------------------|------------|-----------|----------|------------------|------------------------------------------------|-------|--------------|
| 47:44                              | 43         | 42        | 41:36    | 35:34            | 33:16                                          | 15:12 | 11:0         |
| 0110                               | Addressing | Condition | RESERVED | Predicate vector | Instructions type + serialize (9 instructions) | Count | Exec Address |

Check the AND/OR of all current predicate bits. If AND/OR matches the condition execute the specified number of instructions. We need to AND/OR this with the kill mask in order not to consider the pixels that aren't valid. If the condition is not met, we go on to the next control flow instruction. If Conditional Execute Predicates End and the condition is met, this is the last execution block of the shader program.

| Loop_Start |            |          |         |          |              |
|------------|------------|----------|---------|----------|--------------|
| 47:44      | 43         | 42:17    | 20:16   | 15:12    | 11:0         |
| 0111       | Addressing | RESERVED | loop ID | RESERVED | Jump address |

Loop Start. Compares the loop iterator with the end value. If loop condition not met jump to the address. Forward jump only. Also computes the index value. The loop id must match between the start to end, and also indicates which control flow constants should be used with the loop.

| Loop_End |            |          |                 |         |          |               |
|----------|------------|----------|-----------------|---------|----------|---------------|
| 47:44    | 43         | 42:24    | 23:21           | 20:16   | 15:12    | 11:0          |
| 1000     | Addressing | RESERVED | Predicate break | loop ID | RESERVED | start address |

Loop end. Increments the counter by one, compares the loop count with the end value. If loop condition met, continue, else, jump BACK to the start of the loop. If predicate break != 0, then compares predicate vector n (specified by predicate break number). If all bits cleared then break the loop.

The way this is described does not prevent nested loops, and the inclusion of the loop id makes this easy to do.
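The interplay of Loop\_Start and Loop\_End can be sketched as a generator that yields the index value seen by each pass through the loop body. This is an interpretation rather than the specified behaviour: the index formula (start plus iterator times step) is an assumption based on the Loop\_Start and Loop\_Step control flow constants, the end value is taken to be Loop\_count, and the predicate break is left out.

```python
def loop_indices(loop_count: int, loop_start: int, loop_step: int):
    """Yield the index value seen by each pass through the loop body."""
    iterator = 0
    # Loop_Start: if the iterator has already reached the end value,
    # jump forward past the loop without running the body.
    if iterator >= loop_count:
        return
    while True:
        # Assumed index formula; the text only says the index is computed.
        yield loop_start + iterator * loop_step
        # Loop_End: increment the counter, then compare with the end value.
        iterator += 1
        if iterator >= loop_count:
            break  # loop condition met: continue after the loop
        # Otherwise jump back to the start of the loop.


print(list(loop_indices(3, 10, 2)))  # [10, 12, 14]
```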

| Conditional_Call |            |           |                 |          |            |              |
|------------------|------------|-----------|-----------------|----------|------------|--------------|
| 47:44            | 43         | 42        | 41:34           | 33:13    | 12         | 11:0         |
| 1001             | Addressing | Condition | Boolean address | RESERVED | Force Call | Jump address |

If the condition is met, jumps to the specified address and pushes the control flow program counter on the stack. If force call is set the condition is ignored and the call is made always.

| Return |            |          |
|--------|------------|----------|
| 47:44  | 43         | 42:0     |
| 1010   | Addressing | RESERVED |

Pops the topmost address from the stack and jumps to that address. If nothing is on the stack, the program will just continue to the next instruction.
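The call and return behaviour can be illustrated with a small model of the 4-element stack. The names are hypothetical, and two details are assumptions because the text leaves them open: the value pushed is taken to be the address of the instruction following the call, and a call on a full stack raises an error here.

```python
STACK_DEPTH = 4  # stack of 4 elements for nested subroutine calls


class CallStack:
    def __init__(self):
        self.stack = []

    def conditional_call(self, pc, target, boolean, condition, force=False):
        """Return the next control flow address after a Conditional_Call."""
        if force or boolean == condition:
            if len(self.stack) >= STACK_DEPTH:
                # Assumption: overflow behaviour is not specified in the text.
                raise OverflowError("call stack is full")
            self.stack.append(pc + 1)  # assumed: resume after the call
            return target
        return pc + 1                  # condition not met: fall through

    def ret(self, pc):
        """Return the next control flow address after a Return."""
        if self.stack:
            return self.stack.pop()
        return pc + 1                  # empty stack: continue to next instruction
```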

| Conditional_Jump |            |           |                 |         |          |            |              |
|------------------|------------|-----------|-----------------|---------|----------|------------|--------------|
| 47:44            | 43         | 42        | 41:34           | 33      | 32:13    | 12         | 11:0         |
| 1011             | Addressing | Condition | Boolean address | FW only | RESERVED | Force Jump | Jump address |


If force jump is set the condition is ignored and the jump is made always. If FW only is set then only forward jumps are allowed.

| Allocate |       |               |          |                 |
|----------|-------|---------------|----------|-----------------|
| 47:44    | 43    | 42:41         | 40:4     | 3:0             |
| 1100     | Debug | Buffer Select | RESERVED | Allocation size |

Buffer Select takes one of the following values:

- 01 position export (ordered export)
- 10 parameter cache or pixel export (ordered export)
- 11 pass thru (out of order exports).

If debug is set this is a debug alloc (ignore if debug DB\_ON register is set to off).

Marks the end of the program.

#### 6.3 Implementation

The envisioned implementation has a buffer that maintains the state of each thread. A thread lives in a given location in the buffer during its entire life, but the buffer has FIFO qualities in that threads leave in the order that they enter. Actually two buffers are maintained — one for Vertices and one for Pixels. The intended implementation would allow for:

16 entries for vertices

48 entries for pixels.

From each buffer, arbitration logic attempts to select 1 thread for the texture unit and 1 (interleaved) thread for the ALU unit. Once a thread is selected it is read out of the buffer, marked as invalid, and submitted to the appropriate execution unit. It is returned to the buffer (at the same place) with its status updated once all possible sequential instructions have been executed. A switch from ALU to TEX or vice versa, or a Serialize\_Execution modifier, forces the thread to be returned to the buffer.

Each entry in the buffer will be stored across two physical pieces of memory - most bits will be stored in a 1 read port device. Only bits needed for thread arbitration will be stored in a highly multi-ported structure. The bits kept in the 1 read port device will be termed 'state'. The bits kept in the multi-read ported device will be termed 'status'.

'State Bits' needed include:

- 1. Control Flow Instruction Pointer (13 bits),
- 2. Execution Count Marker (4 bits),
- 3. Loop Iterators (4x9 bits),
- 4. Call return pointers (4x12 bits),
- 5. Predicate Bits (64 bits),
- 6. Export ID (1 bit),
- 7. Parameter Cache base Ptr (7 bits),
- 8. GPR Base Ptr (8 bits),
- 9. Context Ptr (3 bits).
- 10. LOD corrections (6x16 bits)
- 11. Valid bits (64 bits)

Absent from this list are 'Index' pointers. These are costly enough that I'm presuming that they are instead stored in the GPRs. The first seven fields above (Control Flow Ptr, Execution Count, Loop Counts, call return ptrs, Predicate bits, PC base ptr and export ID) are updated every time the thread is returned to the buffer, based on how much progress has been made on thread execution. GPR Base Ptr, Context Ptr and LOD corrections are unchanged throughout execution of the thread.

'Status Bits' needed include:

- Valid Thread
- Texture/ALU engine needed
- Texture Reads are outstanding
- Waiting on Texture Read to Complete
- Allocation Wait (2 bits)
  - 00 No allocation needed
  - 01 Position export allocation needed (ordered export)
  - 10 Parameter or pixel export needed (ordered export)
  - 11 pass thru (out of order export)
- Allocation Size (4 bits)
- Position Allocated
- First thread of a new context
- Event thread (NULL thread that needs to trickle down the pipe)
- Last (1 bit)
- Pulse SX (1 bit)

All of the above fields from all of the entries go into the arbitration circuitry. The arbitration circuitry will select a winner for both the Texture Engine and for the ALU engine. There are actually two sets of arbitration -- one for pixels and one for vertices. A final selection is then done between the two. But the rest of this implementation summary only considers the 'first' level selection which is similar for both pixels and vertices.

Texture arbitration requires no allocation or ordering so it is purely based on selecting the 'oldest' thread that requires the Texture Engine.

ALU arbitration is a little more complicated. First, only threads where either of Texture\_Reads\_outstanding or Waiting\_on\_Texture\_Read\_to\_Complete are '0' are considered. Then if Allocation\_Wait is active, these threads are further filtered based on whether space is available. If the allocation is position allocation, then the thread is only considered if all 'older' threads have already done their position allocation (position allocated bits set). If the allocation is parameter or pixel allocation, then the thread is only considered if it is the oldest thread. Also a thread is not considered if it is a parameter or pixel or position allocation, has its First\_thread\_of\_a\_new\_context bit set and would cause ALU interleaving with another thread performing the same parameter or pixel or position allocation. Finally the 'oldest' of the threads that pass through the above filters is selected. If the thread needed to allocate, then at this time the allocation is done, based on Allocation\_Size. If a thread has its "last" bit set, then it is also removed from the buffer, never to return.

If I now redefine 'clauses' to mean 'how many times the thread is removed from the thread buffer for the purpose of execution by either the ALU or Texture engine', then the minimum number of clauses needed is 2: one to perform the allocation for exports (execution automatically halts after an 'Alloc' instruction, which does not itself perform the actual allocation) and one for the actual ALU/export instructions. As the 'Alloc' instruction could be part of a texture clause (presumably the final instruction in such a clause), a thread could still execute in this minimal number of 2 clauses, even if it involved texture fetching.

The Texture\_Reads\_Outstanding bit must be updated by the sequencer, based on keeping track of how many Texture Clauses have been executed by a given thread that have not yet had their data returned. Any number above 0 results in this bit being set. We could consider forcing synchronization such that two texture clauses for a given thread may not be outstanding at any time (that would be my preference for simplicity reasons and because it would require only very little change in the texture pipe interface). This would allow the sequencer to set the bit on execution of the texture clause, and allow the texture unit to return a pointer to the thread buffer on completion that clears the bit.


#### 6.4 Data dependent predicate instructions

Data dependent conditionals will be supported in the R400. The only way we plan to support those is by supporting four vector/scalar predicate operations of the form:

PRED\_SETE\_# - similar to SETE except that the result is 'exported' to the sequencer.

PRED\_SETNE\_# - similar to SETNE except that the result is 'exported' to the sequencer.

PRED\_SETGT\_# - similar to SETGT except that the result is 'exported' to the sequencer

PRED\_SETGTE\_# - similar to SETGTE except that the result is 'exported' to the sequencer

For the scalar operations only we will also support the two following instructions:

PRED\_SETE0\_# - SETE0

PRED\_SETE1\_# - SETE1

The export is a single bit (1 or 0) that is sent using the same data path as the MOVA instruction. The sequencer will maintain 4 sets of 64 bit predicate vectors (in fact 8 sets because we interleave two programs, but only 4 will be exposed) and use them to control the write masking. This predicate is not maintained across clause boundaries. The # sign is used to specify which predicate set you want to use (0 thru 3).

Then we have two conditional execute bits. The first bit is a conditional execute "on" bit and the second bit tells us if we execute on 1 or 0. For example, the instruction:

P0\_ADD\_# R0,R1,R2

is only going to write the result of the ADD into those GPRs whose predicate bit is 0. Alternatively, P1\_ADD\_# would only write the results to the GPRs whose predicate bit is set. The use of P0 or P1 without precharging the sequencer with a PRED instruction is undefined.
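The write masking described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name, the list-based GPR model and the two flag parameters (`cond_on` for the conditional execute "on" bit, `exec_on` for the "execute on 1 or 0" bit) are hypothetical and not taken from the hardware. A real predicate vector is 64 bits wide; four elements are used here for brevity.

```python
def predicated_write(gprs, results, predicate, cond_on, exec_on):
    """Write results[i] into gprs[i] only where the predicate allows it.

    cond_on  -- hypothetical name for the conditional execute "on" bit
    exec_on  -- hypothetical name for the second bit: write where the
                predicate bit equals this value (0 for P0, 1 for P1)
    """
    out = list(gprs)
    for i, value in enumerate(results):
        # With conditional execution off, every element is written.
        if not cond_on or predicate[i] == exec_on:
            out[i] = value
    return out


predicate = [0, 1, 0, 1]          # previously set by a PRED_SET* instruction
old = [0, 0, 0, 0]
new = [10, 20, 30, 40]

p0_result = predicated_write(old, new, predicate, cond_on=True, exec_on=0)
p1_result = predicated_write(old, new, predicate, cond_on=True, exec_on=1)
plain_result = predicated_write(old, new, predicate, cond_on=False, exec_on=0)
```

Here `p0_result` models a P0-style instruction (write where the predicate bit is 0) and `p1_result` a P1-style instruction (write where the predicate bit is set).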

{Issue: do we have to have a NOP between PRED and the first instruction that uses a predicate?}

## 6.5 HW Detection of PV,PS

Because of the control program, the compiler cannot detect statically dependent instructions. In the case of non-masked writes and subsequent reads the sequencer will insert uses of PV,PS as needed. This will be done by comparing the read address and the write address of consecutive instructions. For masked writes, the sequencer will insert NOPs wherever there is a dependent read/write.

The sequencer will also have to insert NOPs between PRED\_SET and MOVA instructions and their uses.

## 6.6 Register file indexing

Because we can have loops in a fetch clause, we need to be able to index into the register file in order to retrieve the data created in a fetch clause loop and use it in an ALU clause. The instruction will include the base address for register indexing and the instruction will contain these controls:

| Bit 7 | Bit 6 | Mode                |
|-------|-------|---------------------|
| 0    | 0     | 'absolute register' |
| 0    | 1     | 'relative register' |
| 1    | 0     | 'previous vector'   |
| 1    | 1     | 'previous scalar'   |

In the case of an absolute register we just take the address as is. In the case of a relative register read we take the base address and we add to it the loop index and this becomes our new address that we give to the shader pipe.

The sequencer is going to keep a loop index computed as such:

Index = Loop\_iterator\*Loop\_step + Loop\_start.

We loop until loop\_iterator = loop\_count. Loop\_step is a signed value [-128...127]. The computed index value is a 10 bit counter that is also signed. Its real range is [-256,256]. The tenth bit is only there so that we can provide an out of range value to the "indexing logic" so that it knows when the provided index is out of range and thus can make the necessary arrangements.
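As a rough illustration of the indexing above, the following Python sketch computes the loop index and resolves an absolute or relative register address. The function names and the `num_regs` bound (128, borrowed from the GPR numbering in section 9) are assumptions made for the example; the actual out-of-range detection in hardware is based on the tenth bit of the index and is not modelled here.

```python
def loop_index(loop_iterator, loop_step, loop_start):
    """Index = Loop_iterator * Loop_step + Loop_start, as in the text."""
    if not -128 <= loop_step <= 127:
        raise ValueError("Loop_step is a signed value in [-128, 127]")
    return loop_iterator * loop_step + loop_start


def resolve_register(base, relative, index, num_regs=128):
    """Return (address, in_range) for an absolute or relative register.

    num_regs is an assumed register-file size used only to flag
    out-of-range addresses in this sketch.
    """
    address = base + index if relative else base
    return address, 0 <= address < num_regs


idx = loop_index(loop_iterator=3, loop_step=2, loop_start=4)   # 3*2 + 4
absolute = resolve_register(base=10, relative=False, index=idx)
relative = resolve_register(base=10, relative=True, index=idx)
overflow = resolve_register(base=120, relative=True, index=idx)
```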

## 6.7 Debugging the Shaders

In order to be able to debug the pixel/vertex shaders efficiently, we provide 2 methods.

#### 6.7.1 Method 1: Debugging registers

Current plans are to expose 2 debugging, or error notification, registers:

- 1. address register where the first error occurred
- 2. count of the number of errors

The sequencer will detect the following groups of errors:

- count overflow
- constant indexing overflow
- register indexing overflow

Compiler recognizable errors:

- jump errors
  - relative jump address > size of the control flow program
- call stack
  - call with stack full
  - return with stack empty

A jump error will always cause the program to break. In this case, a break means that the clause halts execution, but further clauses are still allowed to execute.

With all the other errors, the program can continue to run, potentially to worst-case limits. The program will only break if the DB\_PROB\_BREAK register is set.

If indexing outside of the constant or the register range causes an overflow error, the hardware is specified to return the value at index 0. This could be exploited to generate error tokens, by reserving and initializing the 0th register (or constant) for errors.

{ISSUE : Interrupt to the driver or not?}

#### 6.7.2 Method 2: Exporting the values in the GPRs

1) The sequencer will have a debug active, count register and an address register for this mode.

Under the normal mode execution follows the normal course.

Under the debug mode it is assumed that the program is always exporting n debug vectors and that all other exports to the SX block (position, color, z, etc.) will be turned off (changed into NOPs) by the sequencer (even if they occur before the address stated by the ADDR debug register).

#### 7. Pixel Kill Mask

A vector of 64 bits is kept by the sequencer per group of pixels/vertices. Its purpose is to optimize the texture fetch requests and allow the shader pipe to kill pixels using the following instructions:

MASK\_SETE

MASK\_SETNE

MASK\_SETGT

MASK\_SETGTE


### 8. Multipass vertex shaders (HOS)

Multipass vertex shaders are able to export from the 6 last clauses but to memory ONLY.

## 9. Register file allocation

The register file allocation for vertices and pixels can either be static or dynamic. In both cases, the register file is managed using two round robins (one for pixels and one for vertices). In the dynamic case the boundary between pixels and vertices is allowed to move; in the static case it is fixed to 128-VERTEX\_REG\_SIZE for vertices and PIXEL\_REG\_SIZE for pixels.




Above is an example of how the algorithm works. Vertices come in from top to bottom; pixels come in from bottom to top. Vertices are in orange and pixels in green. The blue line is the tail of the vertices and the green line is the tail of the pixels. Thus anything between the two lines is shared. When pixels meet vertices the line turns white and the boundary is static until both vertices and pixels share the same "unallocated bubble". Then the boundary is allowed to move again. The numbering of the GPRs starts from the bottom of the picture at index 0 and goes up to the top at index 127.

## 10. Fetch Arbitration

The fetch arbitration logic chooses one of the 8 potentially pending fetch clauses to be executed. The choice is made by looking at the fifos from 7 to 0 and picking the first one ready to execute. Once chosen, the clause state machine will send one 2x2 fetch per clock (or 4 fetches in one clock every 4 clocks) until all the fetch instructions of the clause are sent. This means that there cannot be any dependencies between two fetches of the same clause.

The arbitrator will not wait for the fetches to return prior to selecting another clause for execution. The fetch pipe will be able to handle up to X(?) in flight fetches and thus there can be a fair number of active clauses waiting for their fetch return data.
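The selection rule above (scan the FIFOs from 7 down to 0 and take the first one that is ready) amounts to a fixed-priority pick. A minimal Python sketch, assuming a hypothetical `ready` list with one flag per FIFO:

```python
def pick_clause(ready):
    """Return the highest-numbered ready FIFO (7 down to 0), or None.

    ready -- sequence of 8 booleans, ready[i] is True when FIFO i holds a
             clause that is ready to execute (a modelling assumption).
    """
    if len(ready) != 8:
        raise ValueError("expected one ready flag per FIFO (8 total)")
    for fifo in range(7, -1, -1):
        if ready[fifo]:
            return fifo
    return None


only_zero = pick_clause([True, False, False, False, False, False, False, False])
zero_and_three = pick_clause([True, False, False, True, False, False, False, False])
nothing_ready = pick_clause([False] * 8)
```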

## 11. ALU Arbitration

ALU arbitration proceeds in almost the same way as fetch arbitration. The ALU arbitration logic chooses one of the 8 potentially pending ALU clauses to be executed. The choice is made by looking at the fifos from 7 to 0 and picking the first one ready to execute. There are two ALU arbiters, one for the even clocks and one for the odd clocks. For example, here is the sequencing of two interleaved ALU clauses (E and O stand for Even and Odd sets of 4 clocks):

Einst0 Oinst0 Einst1 Oinst1 Einst2 Oinst2 Einst0 Oinst3 Einst1 Oinst4 Einst2 Oinst0...

Proceeding this way hides the latency of 8 clocks of the ALUs. Also note that the interleaving also occurs across clause boundaries.


## 12. Handling Stalls

When the output file is full, the sequencer prevents the ALU arbitration logic from selecting the last clause (this way nothing can exit the shader pipe until there is room in the output file). If the packet is a vertex packet and the position buffer is full (POS\_FULL) then the sequencer also prevents a thread from entering the exporting clause (3?). The sequencer will set the OUT\_FILE\_FULL signal n clocks before the output file is actually full, so the ALU arbiter will be able to read this signal and act accordingly by not allowing exporting clauses to proceed.

#### 13. Content of the reservation station FIFOs

The reservation FIFOs contain the state of the vector of pixels and vertices. We have two sets of those: one for pixels, and one for vertices. They contain 3 bits of Render State, 7 bits for the base address of the GPRs, some bits for LOD correction, coverage mask information (in order to fetch for only valid pixels), and the quad address.

#### 14. The Output File

The output file is where pixels are put before they go to the RBs. The write BW to this store is 256 bits/clock. Just before this output file are staging registers with write BW 512 bits/clock and read BW 256 bits/clock. The staging registers are 4x128 (and there are 16 of those on the whole chip).

#### 15. IJ Format

The IJ information sent by the PA is of this format on a per quad basis:

We have a vector of IJ's (one IJ per pixel at the centroid of the fragment or at the center of the pixel depending on the mode bit). The interpolation is done at a different precision across the 2x2. The upper left pixel's parameters are always interpolated at full 20x24 mantissa precision. Then the result of the interpolation along with the difference in IJ in reduced precision is used to interpolate the parameter for the other three pixels of the 2x2. Here is how we do it:

Assume P0 is the interpolated parameter at Pixel 0 having the barycentric coordinates I(0), J(0), and so on for P1, P2 and P3. Also assume that A is the parameter value at V0 (interpolated with I), B is the parameter value at V1 (interpolated with J) and C is the parameter value at V2 (interpolated with (1-I-J)).

$$\Delta 01I = I(1) - I(0)$$

$$\Delta 01J = J(1) - J(0)$$

$$\Delta 02I = I(2) - I(0)$$

$$\Delta 02J = J(2) - J(0)$$

$$\Delta 03I = I(3) - I(0)$$

$$\Delta 03J = J(3) - J(0)$$

| P0 | P1 |
|----|----|
| P2 | P3 |

$$P0 = C + I(0) * (A - C) + J(0) * (B - C)$$

$$P1 = P0 + \Delta 01I * (A - C) + \Delta 01J * (B - C)$$

$$P2 = P0 + \Delta 02I * (A - C) + \Delta 02J * (B - C)$$

$$P3 = P0 + \Delta 03I * (A - C) + \Delta 03J * (B - C)$$

P0 is computed at 20x24 mantissa precision and P1 to P3 are computed at 8x24 mantissa precision. So far no visual degradation of the image has been seen using this scheme.

- Multiplies (Full Precision): 2
- Multiplies (Reduced precision): 6
- Subtracts 19x24 (Parameters): 2
- Adds: 8
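The quad interpolation equations can be sketched in Python as below. This is a functional illustration only: it uses ordinary Python floats and therefore does not model the 20x24 and 8x24 mantissa precisions, and it computes the deltas itself, whereas the IJ format carries them already computed. The function name and the `ij` argument layout are assumptions made for the example.

```python
def interpolate_quad(a, b, c, ij):
    """Interpolate one parameter over a 2x2 quad.

    a, b, c -- parameter values at V0, V1, V2
    ij      -- four (I, J) pairs for pixels 0..3 (pixel 0 is upper left)
    Returns [P0, P1, P2, P3].
    """
    a_minus_c = a - c               # the 2 parameter subtracts
    b_minus_c = b - c
    i0, j0 = ij[0]
    # P0 uses the full IJ values (2 multiplies, 2 adds).
    p0 = c + i0 * a_minus_c + j0 * b_minus_c
    result = [p0]
    for i_k, j_k in ij[1:]:
        delta_i = i_k - i0
        delta_j = j_k - j0
        # P1..P3 reuse P0 and only need the deltas (2 multiplies, 2 adds each).
        result.append(p0 + delta_i * a_minus_c + delta_j * b_minus_c)
    return result


quad = interpolate_quad(
    a=4.0, b=8.0, c=2.0,
    ij=[(0.25, 0.5), (0.5, 0.5), (0.25, 0.75), (0.5, 0.75)],
)
```

Counting the operations in the sketch gives 2 + 6 multiplies, 2 parameter subtracts and 8 adds, which agrees with the totals listed above.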

FORMAT OF P0's IJ: Mantissa 20 Exp 4 for I + Sign Mantissa 20 Exp 4 for J + Sign

FORMAT of Deltas (x3): Mantissa 8 Exp 4 for I + Sign Mantissa 8 Exp 4 for J + Sign

Total number of bits: $20 \times 2 + 8 \times 6 + 4 \times 8 + 4 \times 2 = 128$

All numbers are kept using the un-normalized floating point convention: if the exponent is different from 0 the number is normalized; if not, the number is un-normalized. The maximum range for the IJs (Full precision) is +/- 63 and the range for the Deltas is +/- 127.

#### 15.1 Interpolation of constant attributes

Because of the floating point imprecision, we need to take special provisions if all the interpolated terms are the same or if two of the barycentric coordinates are the same.

We start with the premise that if A = B and B = C and C = A, then P0,1,2,3 = A. Since one or more of the IJ terms may be zero, we extend this to:

## 16. Staging Registers

In order for the reuse of the vertices to be 14, the sequencer will have to re-order the data sent IN ORDER by the VGT for it to be aligned with the parameter cache memory arrangement. Given the following group of vertices sent by the VGT:

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 || 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 || 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 || 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63

The sequencer will re-arrange them in this fashion:

0 1 2 3 16 17 18 19 32 33 34 35 48 49 50 51 || 4 5 6 7 20 21 22 23 36 37 38 39 52 53 54 55 || 8 9 10 11 24 25 26 27 40 41 42 43 56 57 58 59 || 12 13 14 15 28 29 30 31 44 45 46 47 60 61 62 63

The || markers show the SP divisions. In the event a shader pipe is broken, the VGT will send padding to account for the missing pipe. For example, if SP1 is broken, vertices 4 5 6 7 20 21 22 23 36 37 38 39 52 53 54 55 will still be sent by the VGT to the SQ **BUT** will not be processed by the SP and thus should be considered invalid (by the SU and VGT).
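The re-arrangement shown above can be expressed as a simple permutation: each shader pipe takes four consecutive vertices from each of the four incoming groups of 16. The Python sketch below reproduces the listed ordering; the function name and the constants are illustrative, not names from the design.

```python
NUM_SP = 4            # shader pipes, separated by || in the listing above
GROUP_SIZE = 16       # vertices per incoming group from the VGT
VERTS_PER_SP = 4      # consecutive vertices each SP takes from a group


def reorder_vertices(vertices):
    """Re-arrange 64 in-order vertices into per-SP order."""
    if len(vertices) != NUM_SP * GROUP_SIZE:
        raise ValueError("expected a full vector of 64 vertices")
    out = []
    for sp in range(NUM_SP):
        for group in range(len(vertices) // GROUP_SIZE):
            start = group * GROUP_SIZE + sp * VERTS_PER_SP
            out.extend(vertices[start:start + VERTS_PER_SP])
    return out


reordered = reorder_vertices(list(range(64)))
sp0 = reordered[0:16]
sp1 = reordered[16:32]
```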


The most straightforward, *non-compressed* interface method would be to convert, in the VGT, the data to 32-bit floating point prior to transmission to the VSISRs. In this scenario, the data would be transmitted to (and stored in) the VSISRs in full 32-bit floating point. This method requires three 24-bit fixed-to-float converters in the VGT. Unfortunately, it also requires an additional 3,072 bits of storage across the VSISRs. This interface is illustrated in Figure 11. The area of the fixed-to-float converters and the VSISRs for this method is roughly estimated as 0.759sqmm using the R300 process. The gate count estimate is shown in Figure 10.

| Basis for 8-deep Latch Memory (from R300) | Area     | Notes               |
|-------------------------------------------|----------|---------------------|
| 8x24-bit                                  | 11631 μ² | 60.57813 μ² per bit |
| Area of 96x8-deep Latch Memory            | 46524 μ² |                     |
| Area of 24-bit Fix-to-float Converter     | 4712 μ²  | per converter       |

| Method 1 | Block      | Quantity | Area      |
|----------|------------|----------|-----------|
|          | F2F        | 3        | 14136     |
|          | 8x96 Latch | 16       | 744384    |
|          | Total      |          | 758520 μ² |

Figure 10: Area Estimate for VGT to Shader Interface
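The figures in the area estimate can be re-derived with a few lines of arithmetic. The Python sketch below only restates numbers from the table; the variable names are illustrative, and the factor of 4 assumes the 96x8 latch memory is built from four 24-bit-wide, 8-deep slices, which is what the table values imply.

```python
LATCH_8X24_AREA = 11631      # square microns, 8-deep x 24-bit latch memory (R300)
F2F_AREA = 4712              # square microns per 24-bit fix-to-float converter

per_bit_area = LATCH_8X24_AREA / (8 * 24)        # area per stored bit
latch_96x8_area = 4 * LATCH_8X24_AREA            # 96 bits = 4 x 24 bits (assumed)

f2f_total = 3 * F2F_AREA                         # three converters in the VGT
latch_total = 16 * latch_96x8_area               # sixteen 8x96 latch memories
method1_total = f2f_total + latch_total          # square microns

method1_total_sqmm = method1_total / 1_000_000   # 1 sqmm = 1e6 square microns
```

The result, about 0.759 sqmm, matches the rough estimate quoted in the text.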



Figure 11: VGT to Shader Interface

## 17. The parameter cache

The parameter cache is where the vertex shaders export their data. It consists of 16 128x128 memories (1R/1W). The reuse engine will make it so that all vertexes of a given primitive will hit different memories. The allocation method for these memories is a simple round robin. The parameter cache pointers are mapped in the following way: 4MSBs are the memory number and the 7 LSBs are the address within this memory.

| MEMORY NUMBER | ADDRESS |
|---------------|---------|
| 4 bits        | 7 bits  |

The PA generates the parameter cache addresses as the positions come from the SQ. All it needs to do is keep a Current\_Location pointer (7 bits only) and, as the positions come, increment the memory number. When the memory number field wraps around, the PA increments the Current\_Location by VS\_EXPORT\_COUNT (a snooped register from the SQ). As an example, say the memories are all empty to begin with and the vertex shader is exporting 8 parameters per vertex (VS\_EXPORT\_COUNT = 8). The first position received is going to have the PC address 00000000000, the second one 00010000000, the third one 00100000000 and so on up to 11110000000. Then the next position received (the 17<sup>th</sup>) is going to have the address 00000001000, the 18<sup>th</sup> 00010001000, the 19<sup>th</sup> 00100001000 and so on. The Current\_Location is NEVER reset EXCEPT on chip resets. The only thing to be careful about is that if the SX doesn't send you a full group of positions (<64) then you need to fill the address space so that the next group starts correctly aligned (for example if you receive only 33 positions then you need to add 2\*VS\_EXPORT\_COUNT to Current\_Location and reset the memory count to 0 before the next vector begins).
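The address generation described above can be sketched in Python as follows. The class and method names are hypothetical. Two behaviours are assumptions rather than statements from the text: the 7-bit Current\_Location is modelled as wrapping modulo 128, and the padding rule generalises the 33-position example by assuming every vector consumes four rows (64 positions / 16 memories).

```python
NUM_MEMORIES = 16        # 4-bit memory number
ROWS_PER_MEMORY = 128    # 7-bit address within a memory
WRAPS_PER_VECTOR = 4     # 64 positions / 16 memories (assumed for padding)


class PcAddressGenerator:
    def __init__(self, vs_export_count):
        self.vs_export_count = vs_export_count
        self.current_location = 0    # only reset on chip reset
        self.memory_number = 0

    def next_address(self):
        """Return the 11-bit PC address for the next position received."""
        address = (self.memory_number << 7) | self.current_location
        self.memory_number += 1
        if self.memory_number == NUM_MEMORIES:
            # Memory number wrapped: advance by the export count.
            self.memory_number = 0
            self.current_location = (
                self.current_location + self.vs_export_count
            ) % ROWS_PER_MEMORY
        return address

    def end_of_vector(self, positions_received):
        """Pad after a partial group so the next vector starts aligned."""
        missing_wraps = WRAPS_PER_VECTOR - positions_received // NUM_MEMORIES
        self.current_location = (
            self.current_location + missing_wraps * self.vs_export_count
        ) % ROWS_PER_MEMORY
        self.memory_number = 0


gen = PcAddressGenerator(vs_export_count=8)
addresses = [gen.next_address() for _ in range(19)]

partial = PcAddressGenerator(vs_export_count=8)
for _ in range(33):
    partial.next_address()
partial.end_of_vector(33)    # adds 2 * VS_EXPORT_COUNT, as in the example
```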


#### 17.1 Export restrictions

#### 17.1.1 Pixel exports:

Pixels can export 1, 2, 3 or 4 color buffers to the SX (+z). The exports will be done in order. The PRED\_OPTIMIZE function has to be turned off if the exports are done using interleaved predicated instructions. The exports will always be ordered to the SX.

#### 17.1.2 Vertex exports:

Position or parameter caches can be exported in any order in the shader program. It is always better to export position as soon as possible. Position has to be exported in a single export block (no texture instructions can be placed between the exports). Parameter cache exports can be done in any order with texture instructions interleaved. The PRED\_OPTIMIZE function has to be turned off if the exports are done using interleaved predicated instructions to the Parameter cache (see Arbitration restrictions for details). The exports will always be allocated in order to the SX.

#### 17.1.3 Pass thru exports:

Pass thru exports have to be done in groups of the form:

```
Alloc 4 (8 or 12)
Execute ALU(ADDR) ALU(DATA) ALU(DATA) ALU(DATA)...
```

They cannot have texture instructions interleaved in the export block. These exports are not guaranteed to be ordered.

Also, when doing a pass thru export, Position MUST be exported AFTER all pass thru exports. This position export is used to synchronize the chip when doing a transition from pass thru shader to regular shader and vice versa.

#### 17.2 Arbitration restrictions

Here are the Sequencer arbitration restrictions:

- 1) Cannot execute a serialized thread if the corresponding texture pending bit is set
- 2) Cannot allocate position if any older thread has not allocated position
- 3) If last thread is marked as not valid AND marked as last and we are about to execute the second to oldest thread also marked last then:
  - a. Both threads must be from the same context (cannot allow a first thread)
  - b. Must turn off the predicate optimization for the second thread
- 4) Cannot execute a texture clause if texture reads are pending
- 5) Cannot execute last if texture pending (even if not serial)

## 18. Export Types

The export type (or the location where the data should be put) is specified using the destination address field in the ALU instruction. Here is a list of all possible export modes:

## 18.1 Vertex Shading

0:15 - 16 parameter cache

16:31 - Empty (Reserved?)

32 - Export Address

33:40 - 8 vertex exports to the frame buffer and index

41:47 - Empty

48:55 - 8 debug export (interpret as normal vertex export)

60 - export addressing mode

61 - Empty

62 - position

63 - sprite size export that goes with position export (point h,point w,edgeflag,misc)

### 18.2 Pixel Shading

0 - Color for buffer 0 (primary)

1 - Color for buffer 1

2 - Color for buffer 2

3 - Color for buffer 3

4:7 - Empty

8 - Buffer 0 Color/Fog (primary)

9 - Buffer 1 Color/Fog

10 - Buffer 2 Color/Fog

11 - Buffer 3 Color/Fog

12:15 - Empty

16:31 - Empty (Reserved?)

32 - Export Address

33:40 - 8 exports for multipass pixel shaders.

41:47 - Empty

48:55 - 8 debug exports (interpret as normal pixel export)

60 - export addressing mode

61:62 - Empty

63 - Z for primary buffer (Z exported to 'alpha' component)

#### 19. Special Interpolation modes

#### 19.1 Real time commands

We are unable to use the parameter memory since there is no way for a command stream to write into it. Instead we need to add three 16x128 memories (one for each of three vertices x 16 interpolants). These will be mapped onto the register bus and written by type 0 packets, and output to the parameter busses (the sequencer and/or PA need to be able to address the realtime parameter memory as well as the regular parameter store). For higher performance we should be able to view them as two banks of 16 and do double buffering, allowing one to be loaded while the other is rasterized with. Most overlay shaders will need 2 or 4 scalar coordinates, so one option might be to restrict the memory to 16x64 or 32x64, allowing only two interpolated scalars per cycle. The only problem I see with this is that, if we view support for 16 vector-4 interpolants as important (true only if we map Microsoft's high priority stream to the realtime stream), then the PA/sequencer need to support a realtime-specific mode where we need to address 32 vectors of parameters instead of 16. This mode is triggered by the primitive type: REAL TIME. The actual memories are in the SX blocks. The parameter data memories are hooked on the RBBM bus and are loaded by the CP using register mapped memory.

## 19.2 Sprites/ XY screen coordinates/ FB information

When working with sprites, one may want to overwrite the parameter 0 with SC generated data. Also, XY screen coordinates may be needed in the shader program. This functionality is controlled by the gen\_I0 register (in SQ) in conjunction with the SND\_XY register (in SC). Also it is possible to send the faceness information (for OGL front/back special operations) to the shader using the same control register. Here is a list of all the modes and how they interact together:

Gen\_st is a bit taken from the interface between the SC and the SQ. This is the MSB of the primitive type. If the bit is set, it means we are dealing with Point AA, Line AA or sprite and in this case the vertex values are going to be generated between 0 and 1.

```
Param_Gen_I0 disable, snd_xy disable, no gen_st - I0 = No modification
Param_Gen_I0 disable, snd_xy disable, gen_st - I0 = No modification
Param_Gen_I0 disable, snd_xy enable, no gen_st - I0 = No modification
Param_Gen_I0 disable, snd_xy enable, gen_st - I0 = No modification
Param_Gen_I0 enable, snd_xy disable, no gen_st - I0 = garbage, garbage, garbage, faceness
Param_Gen_I0 enable, snd_xy disable, gen_st - I0 = garbage, garbage, s, t
Param_Gen_I0 enable, snd_xy enable, no gen_st - I0 = screen x, screen y, garbage, faceness
Param_Gen_I0 enable, snd_xy enable, gen_st - I0 = screen x, screen y, s, t
```
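The eight combinations listed above follow a simple pattern, which the Python sketch below makes explicit: Param\_Gen\_I0 gates the whole override, snd\_xy selects the first two components and gen\_st selects the last two. The function name and the string placeholders are purely illustrative.

```python
def i0_contents(param_gen_i0, snd_xy, gen_st):
    """Return the four components written to I0, or None if unmodified."""
    if not param_gen_i0:
        return None                                  # "No modification"
    first_two = ("screen x", "screen y") if snd_xy else ("garbage", "garbage")
    last_two = ("s", "t") if gen_st else ("garbage", "faceness")
    return first_two + last_two


# Enumerate all eight modes in the same order as the listing above.
modes = [
    (pg, xy, st, i0_contents(pg, xy, st))
    for pg in (False, True)
    for xy in (False, True)
    for st in (False, True)
]
```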

#### 19.3 Auto generated counters

In cases where we are dealing with multipass shaders, the sequencer is going to generate a vector count to be able to both use this count to write the 1st pass data to memory and then use the count to retrieve the data on the 2nd pass. The count is always generated in the same way but it is passed to the shader in a slightly different way depending on the shader type (pixel or vertex). This is toggled on and off using the GEN\_INDEX register. The sequencer is going to keep two counters, one for pixels and one for vertices. Every time a full vector of vertices or pixels is written to the GPRs the counter is incremented. Every time a state change is detected, the corresponding counter is reset. While there is only one count broadcast to the GPRs, the LSBs are hardwired to specific values, making the index different for all elements in the vector.

#### 19.3.1 Vertex shaders

In the case of vertex shaders, if GEN\_INDEX is set, the data will be put into the x field of the third register (it means that the compiler must allocate 3 GPRs in all multipass vertex shader modes).

#### 19.3.2 Pixel shaders

In the case of pixel shaders, if GEN\_INDEX is set and Param\_Gen\_I0 is enabled, the data will be put in the x field of the 2<sup>nd</sup> register (R1.x), else if GEN\_INDEX is set the data will be put into the x field of the 1<sup>st</sup> register (R0.x).
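The placement rules for the generated count in vertex and pixel shaders can be summarised by the small Python sketch below. It returns the zero-based GPR number whose x field receives the count; reading "third register" as R2 is an inference from the R0.x and R1.x naming used for pixel shaders, and the function and argument names are illustrative.

```python
def gen_index_register(shader_type, gen_index, param_gen_i0=False):
    """Return the GPR number whose x field receives the count, or None."""
    if not gen_index:
        return None                      # GEN_INDEX off: no count is written
    if shader_type == "vertex":
        return 2                         # third register (assumed to be R2.x)
    if shader_type == "pixel":
        return 1 if param_gen_i0 else 0  # R1.x if I0 is generated, else R0.x
    raise ValueError("shader_type must be 'vertex' or 'pixel'")


vertex_reg = gen_index_register("vertex", gen_index=True)
pixel_reg = gen_index_register("pixel", gen_index=True)
pixel_reg_with_i0 = gen_index_register("pixel", gen_index=True, param_gen_i0=True)
```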



Figure 12: GPR input mux Control

## 20. State management

Every clock, the sequencer will report to the CP the oldest states still in the pipe. These are the states of the programs as they enter the last ALU clause.

## 20.1 Parameter cache synchronization

In order for the sequencer not to begin a group of pixels before the associated group of vertices has finished, the sequencer will keep a 6 bit count per state (for a total of 8 counters). These counters are initialized to 0 and every time a vertex shader exports its data TO THE PARAMETER CACHE, the corresponding counter is incremented. When the SC sends a new vector of pixels with the SC\_SQ\_new\_vector bit asserted, the sequencer will first check if the count is greater than 0 before accepting the transmission (it will in fact accept the transmission but then lower its ready to receive). Then the sequencer waits for the count to go to one and decrements it. The sequencer can then issue the group of pixels to the interpolators. Every time the state changes, the new state counter is initialized to 0.
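A minimal Python sketch of the per-state counters described above, under some simplifying assumptions: the "wait" is modelled by `try_issue_pixels` returning False rather than by stalling, the ready-to-receive handshake with the SC is not modelled, and the class and method names are hypothetical.

```python
NUM_STATES = 8
MAX_COUNT = 63           # 6-bit counter


class ParamCacheSync:
    def __init__(self):
        self.count = [0] * NUM_STATES

    def state_change(self, state):
        """A new state starts with its counter at 0."""
        self.count[state] = 0

    def vertex_export(self, state):
        """Called when a vertex shader exports to the parameter cache."""
        if self.count[state] >= MAX_COUNT:
            raise OverflowError("6-bit counter would overflow")
        self.count[state] += 1

    def try_issue_pixels(self, state):
        """Issue a new pixel vector only if vertex data is available."""
        if self.count[state] > 0:
            self.count[state] -= 1
            return True
        return False     # the sequencer would wait here


sync = ParamCacheSync()
blocked = sync.try_issue_pixels(state=0)     # no vertex export yet
sync.vertex_export(state=0)
issued = sync.try_issue_pixels(state=0)      # vertex data now present
blocked_again = sync.try_issue_pixels(state=0)
```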

#### 21. XY Address imports

The SC will be able to send the XY addresses to the GPRs. It does so by interleaving the writes of the IJs (to the IJ buffer) with XY writes (to the XY buffer). Then when writing the data to the GPRs, the sequencer is going to interpolate the IJ data or pass the XY data thru a Fix→float converter and expander and write the converted values to the GPRs. The XYs are currently SCREEN SPACE COORDINATES. The values in the XY buffers will wrap. See section 19.2 for details on how to control the interpolation in this mode.

#### 21.1 Vertex indexes imports

In order to import vertex indexes, we have 16 8x96 staging registers. These are loaded one line at a time by the VGT block (96 bits). They are loaded in floating point format and can be transferred in 4 or 8 clocks to the GPRs.

#### 22. Registers

#### 22.1 Control

| Name               | Description                                                                                                                                                           |
|--------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| REG_DYNAMIC        | Dynamic allocation (pixel/vertex) of the register file on or off.                                                                                                     |
| REG_SIZE_PIX       | Size of the register file's pixel portion (minimal size when dynamic allocation turned on)                                                                            |
| REG_SIZE_VTX       | Size of the register file's vertex portion (minimal size when dynamic allocation turned on)                                                                           |
| ARBITRATION_POLICY | Policy of the arbitration between vertexes and pixels                                                                                                                 |
| INST_BASE_VTX      | Start point for the vertex instruction store (RT always ends at vertex_base and begins at 0)                                                                          |
| INST_BASE_PIX      | Start point for the pixel shader instruction store                                                                                                                    |
| ONE_THREAD         | Debug state register. Only allows one program at a time into the GPRs                                                                                                 |
| ONE_ALU            | Debug state register. Only allows one ALU program at a time to be executed (instead of 2)                                                                             |
| INSTRUCTION        | This is where the CP puts the base address of the instruction writes and type (auto-incremented on reads/writes). Register mapped                                     |
| CONSTANTS          | 512\*4 ALU constants + 32\*6 texture state 32 bit registers (logically mapped)                                                                                        |
| CONSTANTS_RT       | 256\*4 ALU constants + 32\*6 texture states? (physically mapped)                                                                                                      |
| CONSTANT_EO_RT     | This is the size of the space reserved for real time in the constant store (from 0 to CONSTANT_EO_RT). The re-mapping table operates on the rest of the memory        |
| TSTATE_EO_RT       | This is the size of the space reserved for real time in the fetch state store (from 0 to TSTATE_EO_RT). The re-mapping table operates on the rest of the memory       |

#### 22.2 Context

| Name               | Description                                                                                                                                                                  |
|--------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| PS_BASE            | Base pointer for the pixel shader in the instruction store                                                                                                                   |
| VS_BASE            | Base pointer for the vertex shader in the instruction store                                                                                                                  |
| VS_CF_SIZE         | Size of the vertex shader (# of instructions in control program/2)                                                                                                           |
| PS_CF_SIZE         | Size of the pixel shader (# of instructions in control program/2)                                                                                                            |
| PS_SIZE            | Size of the pixel shader (cntl+instructions)                                                                                                                                 |
| VS_SIZE            | Size of the vertex shader (cntl+instructions)                                                                                                                                |
| PS_NUM_REG         | Number of GPRs to allocate for pixel shader programs                                                                                                                         |
| VS_NUM_REG         | Number of GPRs to allocate for vertex shader programs                                                                                                                        |
| PARAM_SHADE        | One 16 bit register specifying which parameters are to be gouraud shaded (0 = flat, 1 = gouraud)                                                                             |
| PARAM_WRAP         | 64 bits: for which parameters (and channels (xyzw)) do we do the cyl wrapping (0=linear, 1=cylindrical).                                                                     |
| PS_EXPORT_MODE     | 0xxxx: Normal mode. 1xxxx: Multipass mode. If normal, bbbz where bbb is how many colors (0-4) and z is export z or not. If multipass, 1-12 exports for color.                |
| VS_EXPORT_MODE     | 0: position (1 vector), 1: position (2 vectors), 3: multipass                                                                                                                |
| VS_EXPORT_COUNT    | Number of locations exported by the VS (and thus number of interpolated parameters)                                                                                          |
| PARAM_GEN_I0       | Do we overwrite or not the parameter 0 with XY data and generated T and S values                                                                                             |
| GEN_INDEX          | Auto generates an address from 0 to XX. Puts the results into R0-1 for pixel shaders and R2 for vertex shaders                                                               |
| CONST_BASE_VTX     | (9 bits) Logical base address for the constants of the vertex shader                                                                                                         |
| CONST_BASE_PIX     | (9 bits) Logical base address for the constants of the pixel shader                                                                                                          |
| CONST_SIZE_PIX     | (8 bits) Size of the logical constant store for pixel shaders                                                                                                                |
| CONST_SIZE_VTX     | (8 bits) Size of the logical constant store for vertex shaders                                                                                                               |
| INST_PRED_OPTIMIZE | Turns on the predicate bit optimization (if off, conditional_execute_predicates is always executed).                                                                         |
| CF_BOOLEANS        | 256 boolean bits                                                                                                                                                             |
| CF_LOOP_COUNT      | 32x8 bit counters (number of times we traverse the loop)                                                                                                                     |
| CF_LOOP_START      | 32x8 bit counters (init value used in index computation)                                                                                                                     |
| CF_LOOP_STEP       | 32x8 bit counters (step value used in index computation)                                                                                                                     |
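As an illustration of the PS\_EXPORT\_MODE encoding listed above, the following Python sketch decodes the 5 bit value. The function name is hypothetical. The normal-mode layout (bbbz) comes from the register description; for multipass mode the text only says "1-12 exports for color", so reading the export count directly from the low four bits is an assumption made for this sketch.

```python
def decode_ps_export_mode(mode):
    """Decode a 5 bit PS_EXPORT_MODE value into a small dictionary."""
    if not 0 <= mode < 32:
        raise ValueError("PS_EXPORT_MODE is a 5 bit field")
    low = mode & 0xF
    if mode & 0x10:
        # Multipass mode (1xxxx).
        # Assumption: the low four bits hold the color export count (1-12).
        if not 1 <= low <= 12:
            raise ValueError("multipass export count must be 1-12")
        return {"multipass": True, "color_exports": low, "export_z": False}
    # Normal mode (0bbbz): bbb = number of colors (0-4), z = export z or not.
    colors = low >> 1
    if colors > 4:
        raise ValueError("normal mode supports 0-4 colors")
    return {"multipass": False, "color_exports": colors, "export_z": bool(low & 1)}
```

For example, the value `0b00101` decodes as normal mode with two colors and Z export enabled.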

#### 23. DEBUG Registers

#### 23.1 Context

| Name          | Description                                                         |
|---------------|---------------------------------------------------------------------|
| DB_PROB_ADDR  | Instruction address where the first problem occurred                |
| DB_PROB_COUNT | Number of problems encountered during the execution of the program  |
| DB_PROB_BREAK | Break the clause if an error is found.                              |
| DB_ON         | Turns on and off debug method 2                                     |
| DB_INST_COUNT | Instruction counter for debug method 2                              |
| DB_BREAK_ADDR | Break address for method number 2                                   |

#### 23.2 Control

| Name              | Description                                |
|-------------------|--------------------------------------------|
| DB_ALUCST_MEMSIZE | Size of the physical ALU constant memory   |
| DB_TSTATE_MEMSIZE | Size of the physical texture state memory  |

## 24. Interfaces

#### 24.1 External Interfaces

Whenever an x is used, it means that the bus is broadcast to all units of the same name. For example, if a bus is named SQ-SPx it means that SQ is going to broadcast the same information to all SP instances.

#### 24.2 SC to SP Interfaces

## 24.2.1 SC\_SP#

There is one of these interfaces at the front of each SP (buffer to stage pixel interpolators). This interface transmits the I,J data for pixel interpolation. For the entire system, two quads per clock are transferred to the 4 SPs, so each of these 4 interfaces transmits one half of a quad per clock. The interface below describes a half of a quad worth of data.

The actual data which is transferred per quad is:

- Ref Pix I => S4.20 Floating Point I value
- Ref Pix J => S4.20 Floating Point J value
- Delta Pix I (x3) => S4.8 Floating Point Delta I value
- Delta Pix J (x3) => S4.8 Floating Point Delta J value

This equates to a total of 128 bits, which is transferred over 2 clocks and therefore needs an interface 64 bits wide.

Additionally, X,Y data (12-bit unsigned fixed) is conditionally sent across this data bus over the same wires in an additional clock. The X,Y data is sent on the lower 24 bits of the data bus with faceness in the msb. Transfers across these interfaces are synchronized with the SC\_SQ IJ Control Bus transfers.

The data transfer across each of these busses is controlled by an IJ\_BUF\_INUSE\_COUNT in the SC. Each time the SC has sent a pixel vector's worth of data to the SPs, he will increment the IJ\_BUF\_INUSE\_COUNT count. Prior to sending the next pixel vector's data, he will check to make sure the count is less than MAX\_BUFER\_MINUS\_2; if not, the SC will stall until the SQ returns a pipelined pulse to decrement the count when he has scheduled a buffer free. Note: We could/may optimize for the case of sending only IJ to use all the buffers to pre-load more. Currently it is planned for the SP to hold 2 double buffers of I,J data and two buffers of X,Y data, so if either X,Y or Centers and Centroids are on, then the SC can send two buffers.
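The buffer accounting described above is a simple credit scheme, which the following Python sketch models. The class and method names are hypothetical, and the numeric value of the limit (MAX\_BUFER\_MINUS\_2 in the text) is not given in this section, so it is passed in as a parameter.

```python
class IJBufferCredit:
    """Behavioural sketch of the SC-side IJ_BUF_INUSE_COUNT check."""

    def __init__(self, limit):
        # 'limit' stands in for MAX_BUFER_MINUS_2; its value is an assumption.
        self.limit = limit
        self.in_use = 0

    def can_send(self):
        # The SC may send only while the count is below the limit.
        return self.in_use < self.limit

    def send_vector(self):
        # Returns False when the SC has to stall instead of sending.
        if not self.can_send():
            return False
        self.in_use += 1
        return True

    def free_buff_pulse(self):
        # Models the pipelined pulse returned by the SQ when a buffer is freed.
        if self.in_use == 0:
            raise RuntimeError("free pulse received with no buffer in use")
        self.in_use -= 1
```

With a limit of 2, two vectors can be sent back to back, a third attempt stalls, and sending resumes after one free pulse.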

In at least the initial version, the SC shall send 16 quads per pixel vector even if the vector is not full. This will increment buffer write address pointers correctly all the time. (We may revisit this for the SX, SP and SQ and add an EndOfVector signal on all interfaces to quit early. We opted for the simple mode first with a belief that only the end of packet and multiple new vector signals should cause a partial vector and that this would not really be a significant performance hit.)

| Name                  | Bits | Description                                                                                        |
|-----------------------|------|----------------------------------------------------------------------------------------------------|
| SC_SP#_data           | 64   | IJ information sent over 2 clocks (or X,Y in 24 LSBs with faceness in upper bit)                   |
|                       |      | Type 0 or 1: first clock I, second clock J                                                         |
|                       |      | Field: ULC, URC, LLC, LRC                                                                          |
|                       |      | Bits: [63:39], [38:26], [25:13], [12:0]                                                            |
|                       |      | Format: SE4M20, SE4M8, SE4M8, SE4M8                                                                |
|                       |      | Type 2                                                                                             |
|                       |      | Field: Face, X, Y                                                                                  |
|                       |      | Bits: [63], [23:12], [11:0]                                                                        |
|                       |      | Format: Bit, Unsigned, Unsigned                                                                    |
| SC_SP#_valid          | 1    | Valid                                                                                              |
| SC_SP#_last_quad_data | 1    | This bit will be set on the last transfer of data per quad.                                        |
| SC_SP#_type           | 2    | 0 -> Indicates centroids                                                                           |
|                       |      | 1 -> Indicates centers                                                                             |
|                       |      | 2 -> Indicates X,Y Data and faceness on data bus                                                   |
|                       |      | The SC shall look at state data to determine how many types to send for the interpolation process. |

The # is included for clarity in the spec and will be replaced with a prefix of u#\_ in the verilog module statement for the SC. The SP block will have neither, because the instantiation will insert the prefix.
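To make the bit layout of the 64 bit data bus concrete, here is a small Python sketch that packs and unpacks both layouts from the table above. The helper names are hypothetical, and the fields are treated as raw bit patterns: no attempt is made to interpret the SE4M20 or SE4M8 floating point encodings.

```python
# Field positions as (msb, lsb), taken from the interface table.
IJ_FIELDS = {"ULC": (63, 39), "URC": (38, 26), "LLC": (25, 13), "LRC": (12, 0)}
XY_FIELDS = {"face": (63, 63), "x": (23, 12), "y": (11, 0)}


def pack_fields(layout, values):
    """Pack raw field values into one 64 bit word according to 'layout'."""
    word = 0
    for name, (msb, lsb) in layout.items():
        width = msb - lsb + 1
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word |= value << lsb
    return word


def unpack_fields(layout, word):
    """Extract every field of 'layout' from a 64 bit word."""
    return {
        name: (word >> lsb) & ((1 << (msb - lsb + 1)) - 1)
        for name, (msb, lsb) in layout.items()
    }
```

The four IJ fields are 25 + 13 + 13 + 13 = 64 bits wide, which matches the stated bus width for one clock of a type 0 or type 1 transfer.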

#### 24.2.2 SC\_SQ

This is the control information sent to the sequencer in order to synchronize and control the interpolation and/or loading data into the GPRs needed to execute a shader program on the sent pixels. This data will be sent over two clocks per transfer with 1 to 16 transfers. Therefore the bus (approx 92-94 bits) could be folded in half to approx 47-49 bits.

| Name       | Bits | Description                                |  |  |  |
|------------|------|--------------------------------------------|--|--|--|
| SC_SQ_data | 46   | Control Data sent to the SQ                |  |  |  |
|            |      | 1 clk transfers                            |  |  |  |
|            |      | Event – valid data consist of event id and |  |  |  |
|            |      | state_id. Instruct SQ to post an           |  |  |  |
|            |      | event vector to send state id and          |  |  |  |
|            |      | event_id through request fifo              |  |  |  |

Exhibit 2030. doc R400\_Sequencer.doc 74578 Bytes\*\*\* © ATI Confidential. Reference Copyright Notice on Cover Page © \*\*\*

|             | ORIGINATE I  | DATE   | EDIT DATE                                                                                                                                                                                                                                                                                                        | DOCUMENT-REV. NUM.                                                                                                                                                                                                                                                                                                                                                                                                                                                                        | PAGE     |
|-------------|--------------|--------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|
| <b>600</b>  | 24 September | , 2001 | 4 September, 201513                                                                                                                                                                                                                                                                                              | GEN-CXXXXX-REVA                                                                                                                                                                                                                                                                                                                                                                                                                                                                           | 39 of 53 |
| SC SQ valid |              |        | and onto the making suit gets back to follow end vectors will support transfer power without any vector will request fife attached to outstanding if no valid collections Quad Data Valid — Send without new New vector fife with or pc_deallood vector unlet this case the posted in the Quad The Quad The Quad | of pc_dealloc ctor. Receipt of this is to c_dealloc or new_vector y valid quad data. New always be posted to o and pc_dealloc will be o any pixel vector g or posted in request fifo quad outstanding.  ing quad data with or w_vector or pc_dealloc. r will be posted to request without a pixel vector and e will be posted with a pixel ess none is in progress. In the pc_dealloc will be the request queue. s will be transferred with mask set but the pixel ding pixel mask set to |          |
| OC_OQ_valid |              | 1 0    | o sending valid data, Z Cik                                                                                                                                                                                                                                                                                      | Could be all Zeldes                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |          |

#### SC\_SQ\_data – first clock and second clock transfers are shown in the table below.

| Name                           | BitField | Bits | Description                                                                                                                                              |
|--------------------------------|----------|------|----------------------------------------------------------------------------------------------------------------------------------------------------------|
| 1<sup>st</sup> Clock Transfer  |          |      |                                                                                                                                                          |
| SC_SQ_event                    | 0        | 1    | This transfer is a 1 clock event vector. Force quad_mask = new_vector = pc_dealloc = 0                                                                   |
| SC_SQ_event_id                 | [4:1]    | 4    | This field identifies the event. 0 => denotes an End Of State Event, 1 => TBD                                                                            |
| SC_SQ_pc_dealloc               | [7:5]    | 3    | Deallocation token for the Parameter Cache                                                                                                               |
| SC_SQ_new_vector               | 8        | 1    | The SQ must wait for Vertex shader done count > 0 and after dispatching the Pixel Vector the SQ will decrement the count.                                |
| SC_SQ_quad_mask                | [12:9]   | 4    | Quad Write mask left to right SP0 => SP3                                                                                                                 |
| SC_SQ_end_of_prim              | 13       | 1    | End Of the primitive                                                                                                                                     |
| SC_SQ_state_id                 | [16:14]  | 3    | State/constant pointer (6\*3+3)                                                                                                                          |
| SC_SQ_pix_mask                 | [32:17]  | 16   | Valid bits for all pixels SP0=>SP3 (UL,UR,LL,LR)                                                                                                         |
| SC_SQ_provok_vtx               | [37:36]  | 2    | Provoking vertex for flat shading                                                                                                                        |
| SC_SQ_pc_ptr0                  | [48:38]  | 11   | Parameter Cache pointer for vertex 0                                                                                                                     |
| 2<sup>nd</sup> Clock Transfer  |          |      |                                                                                                                                                          |
| SC_SQ_pc_ptr1                  | [10:0]   | 11   | Parameter Cache pointer for vertex 1                                                                                                                     |
| SC_SQ_pc_ptr2                  | [21:11]  | 11   | Parameter Cache pointer for vertex 2                                                                                                                     |
|                                |          | 24   | LOD correction per quad (6 bits per quad)                                                                                                                |
| SC_SQ_prim_type                | [48:46]  | 3    | Stippled line and Real time command need to load tex cords from alternate buffer. 000: Normal, 100: Realtime, 101: Line AA, 110: Point AA (Sprite)       |
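The first-clock layout above can be exercised with a short decoder. The Python sketch below is illustrative only: the function name and the returned dictionary shape are hypothetical, and treating the word as 49 bits wide follows from the highest listed bit position ([48:38]) rather than from the "46" given for SC\_SQ\_data. When the event bit is set, only event\_id and state\_id are reported, since the table says the other control bits are forced to 0.

```python
# (msb, lsb) positions of the first-clock fields, taken from the table.
FIRST_CLOCK_FIELDS = {
    "event": (0, 0),
    "event_id": (4, 1),
    "pc_dealloc": (7, 5),
    "new_vector": (8, 8),
    "quad_mask": (12, 9),
    "end_of_prim": (13, 13),
    "state_id": (16, 14),
    "pix_mask": (32, 17),
    "provok_vtx": (37, 36),
    "pc_ptr0": (48, 38),
}


def decode_first_clock(word):
    """Split a first-clock SC_SQ_data word into named fields."""
    if not 0 <= word < (1 << 49):
        raise ValueError("first-clock word is assumed to be 49 bits wide")
    fields = {
        name: (word >> lsb) & ((1 << (msb - lsb + 1)) - 1)
        for name, (msb, lsb) in FIRST_CLOCK_FIELDS.items()
    }
    if fields["event"]:
        # Event vectors carry only the event id and the state id.
        return {
            "kind": "event",
            "event_id": fields["event_id"],
            "state_id": fields["state_id"],
        }
    fields["kind"] = "quad"
    return fields
```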


| Name               | Bits | Description                                                                                                                                                  |
|--------------------|------|--------------------------------------------------------------------------------------------------------------------------------------------------------------|
| SQ_SC_free_buff    | 1    | Pipelined bit that instructs SC to decrement count of buffers in use.                                                                                        |
| SQ_SC_dec_cntr_cnt | 1    | Pipelined bit that instructs SC to decrement count of new vector and/or event sent to prevent SC from overflowing SQ interpolator/Reservation request fifo.  |

The scan converter will submit a partial vector whenever:

- 1.) He gets a primitive marked with an end of packet signal.
- 2.) A current pixel vector is being assembled with one or more valid quads and the vector has been marked for deallocate when a primitive marked new\_vector arrives. The Scan Converter will submit a partial vector (up to 16 quads with zero pixel mask to fill out the vector) prior to submitting the new\_vector marker/primitive.

(This will prevent a hang which can be demonstrated when all primitives in a packet three vectors are culled except for a one-quad primitive that gets marked pc\_dealloc (vertices maximum size). In this case two new\_vectors are submitted and processed, but then one valid quad with the pc\_dealloc creates a vector and then the new\_vector would wait for another vertex vector to be processed, but the one being waited for could never export until the pc\_dealloc signal made it through, and thus the hang.)

#### 24.2.3 SQ to SX: Interpolator bus

| Name                       | Direction | Bits | Description                                  |
|----------------------------|-----------|------|----------------------------------------------|
| SQ_SXx_interp_flat_vtx     | SQ→SPx    | 2    | Provoking vertex for flat shading            |
| SQ_SXx_interp_flat_gouraud | SQ→SPx    | 1    | Flat or gouraud shading                      |
| SQ_SXx_interp_cyl_wrap     | SQ→SPx    | 4    | Which channel needs to be cylindrical wrapped |
| SQ_SXx_pc_ptr0             | SQ→SXx    | 11   | Parameter Cache Pointer                      |
| SQ_SXx_pc_ptr1             | SQ→SXx    | 11   | Parameter Cache Pointer                      |
| SQ_SXx_pc_ptr2             | SQ→SXx    | 11   | Parameter Cache Pointer                      |
| SQ_SXx_rt_sel              | SQ→SXx    | 1    | Selects between RT and Normal data           |
| SQ_SXx_pc_wr_en            | SQ→SXx    | 1    | Write enable for the PC memories             |
| SQ_SXx_pc_wr_addr          | SQ→SXx    | 7    | Write address for the PCs                    |
| SQ_SXx_pc_channel_mask     | SQ→SXx    | 4    | Channel mask                                 |

#### 24.2.4 SQ to SP: Staging Register Data

This is a broadcast bus that sends the VSISR information to the staging registers of the shader pipes.

| Name               | Direction | Bits | Description                                            |
|--------------------|-----------|------|--------------------------------------------------------|
| SQ_SPx_vsr_data    | SQ→SPx    | 96   | Pointers of indexes or HOS surface information         |
| SQ_SPx_vsr_double  | SQ→SPx    | 1    | 0: Normal 96 bits per vert 1: double 192 bits per vert |
| SQ_SP0_vsr_valid   | SQ→SP0    | 1    | Data is valid                                          |
| SQ_SP1_vsr_valid   | SQ→SP1    | 1    | Data is valid                                          |
| SQ_SP2_vsr_valid   | SQ→SP2    | 1    | Data is valid                                          |
| SQ_SP3_vsr_valid   | SQ→SP3    | 1    | Data is valid                                          |
| SQ_SPx_vsr_read    | SQ→SPx    | 1    | Increment the read pointers                            |

#### 24.2.5 VGT to SQ: Vertex interface

#### 24.2.5.1 Interface Signal Table

The area difference between the two methods is not sufficient to warrant complicating the interface or the state requirements of the VSISRs. Therefore, the POR for this interface is that the VGT will transmit the data to the VSISRs (via the Shader Sequencer) in full, 32-bit floating-point format. The VGT can transmit up to six 32-bit floating-point values to each VSISR where four or more values require two transmission clocks. The data bus is 96 bits wide.


| Name                 | Bits | Description                                                                                                                           |
|----------------------|------|---------------------------------------------------------------------------------------------------------------------------------------|
| VGT_SQ_vsisr_data    | 96   | Pointers of indexes or HOS surface information                                                                                        |
| VGT_SQ_vsisr_double  | 1    | 0: Normal 96 bits per vert 1: double 192 bits per vert                                                                                |
| VGT_SQ_end_of_vector | 1    | Indicates the last VSISR data set for the current process vector (for double vector data, "end_of_vector" is set on the first vector) |
| VGT_SQ_indx_valid    | 1    | Vsisr data is valid                                                                                                                   |
| VGT_SQ_state         | 3    | Render State (6\*3+3 for constants). This signal is guaranteed to be correct when "VGT_SQ_end_of_vector" is high.                      |
| VGT_SQ_send          | 1    | Data on the VGT_SQ is valid (see write-up for standard R400 SEND/RTR interface handshaking)                                           |
| SQ_VGT_rtr           | 1    | Ready to receive (see write-up for standard R400 SEND/RTR interface handshaking)                                                      |
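The relationship between the number of values sent to a VSISR and the number of transmission clocks follows directly from the 96 bit bus width (three 32 bit values per clock). A minimal Python sketch, with a hypothetical function name and the assumption that at least one value is always sent:

```python
BUS_BITS = 96     # width of the VGT_SQ_vsisr_data bus
VALUE_BITS = 32   # each value is a 32 bit float
MAX_VALUES = 6    # the VGT can transmit up to six values per VSISR


def vsisr_transfer_clocks(num_values):
    """Number of clocks needed to send 'num_values' values to one VSISR."""
    if not 1 <= num_values <= MAX_VALUES:
        raise ValueError("expected between 1 and 6 values")
    values_per_clock = BUS_BITS // VALUE_BITS  # 3 values fit in one clock
    # Ceiling division: 1-3 values take one clock, 4-6 take two.
    return -(-num_values // values_per_clock)
```

A result of two clocks corresponds to the "double" case flagged by VGT\_SQ\_vsisr\_double (192 bits per vertex).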

## 24.2.5.2 Interface Diagrams






Figure 1. Detailed Logical Diagram for PA SQ vgt Interface.


#### 24.2.6 SQ to SX: Control bus

| Name               | Direction | Bits | Description                                                                                                                                    |
|--------------------|-----------|------|------------------------------------------------------------------------------------------------------------------------------------------------|
| SQ_SXx_exp_type    | SQ→SXx    | 2    | 00: Pixel without z (1 to 4 buffers) 01: Pixel with z (1 to 4 buffers) 10: Position (1 or 2 results) 11: Pass thru (4,8 or 12 results aligned) |
| SQ_SXx_exp_number  | SQ→SXx    | 2    | Number of locations needed in the export buffer (encoding depends on the type, see below).                                                     |
| SQ_SXx_exp_alu_id  | SQ→SXx    | 1    | ALU ID                                                                                                                                         |
| SQ_SXx_exp_valid   | SQ→SXx    | 1    | Valid bit                                                                                                                                      |
| SQ_SXx_exp_state   | SQ→SXx    | 3    | State Context                                                                                                                                  |
| SQ_SXx_free_done   | SQ→SXx    | 1    | Pulse to indicate that the previous export is finished (this can be sent with or without the other fields of the interface)                    |
| SQ_SXx_free_alu_id | SQ→SXx    | 1    | ALU ID                                                                                                                                         |

Depending on the type, the number of export locations changes:

- Type 00: Pixels without Z
  - 00 = 1 buffer
  - 01 = 2 buffers
  - 10 = 3 buffers
  - 11 = 4 buffers
- Type 01: Pixels with Z
  - 00 = 2 buffers (color + Z)
  - 01 = 3 buffers (2 color + Z)
  - 10 = 4 buffers (3 color + Z)
  - 11 = 5 buffers (4 color + Z)
- Type 10: Position export
  - 00 = 1 position
  - 01 = 2 positions
  - 1X = Undefined
- Type 11: Pass Thru
  - 00 = 4 buffers
  - 01 = 8 buffers
  - 10 = 12 buffers
  - 11 = Undefined
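The encoding above maps naturally onto a lookup table. The following Python sketch is a hedged illustration: the table and function names are hypothetical, and raising an error for the combinations marked "Undefined" is a choice made for this example rather than specified behaviour.

```python
# exp_type -> {exp_number -> number of export locations}
EXPORT_LOCATIONS = {
    0b00: {0b00: 1, 0b01: 2, 0b10: 3, 0b11: 4},   # pixels without Z
    0b01: {0b00: 2, 0b01: 3, 0b10: 4, 0b11: 5},   # pixels with Z
    0b10: {0b00: 1, 0b01: 2},                     # position export
    0b11: {0b00: 4, 0b01: 8, 0b10: 12},           # pass thru
}


def export_locations(exp_type, exp_number):
    """Return how many export buffer locations a request needs."""
    try:
        return EXPORT_LOCATIONS[exp_type][exp_number]
    except KeyError:
        # Covers the "Undefined" encodings and out-of-range inputs.
        raise ValueError(
            f"undefined export encoding: type={exp_type:02b}, number={exp_number:02b}"
        ) from None
```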

Below the thick black line is the end of transfer packet that tells the SX that a given export is finished. The report packet will always arrive either before or at the same time as the next export to the same ALU id.

## 24.2.7 SX to SQ: Output file control

| Name                 | Direction | Bits | Description |
|----------------------|-----------|------|-------------|
| SXx_SQ_exp_count_rdy | SXx→SQ    | 1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
                                                                                    | Raised by SX0 to indicate that the following two fields                                                                                                                                                  |
|                      |           |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                                                                    | reflect the result of the most recent export                                                                                                                                                             |
| SXx_SQ_exp_pos_avail | SXx→SQ    | 1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
                                                                                    | Specifies whether there is room for another position.                                                                                                                                                    |
| SXx_SQ_exp_buf_avail | SXx→SQ    | 7                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
                                                                                    | Specifies the space available in the output buffers.  0: buffers are full  1: 2K-bits available (32-bits for each of the 64 pixels in a clause)  64: 128K-bits available (16 128-bit entries for each of |
|                      |           | and the second s | 64 pixels)<br>65-127: RESERVED                                                                                                                                                                           |
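
The `SXx_SQ_exp_buf_avail` encoding can be read as a count of 2K-bit units. The sketch below is illustrative only: the function name is hypothetical, and it assumes that codes between 1 and 64 scale linearly, which the two listed values suggest but the table does not state.

```python
BITS_PER_UNIT = 32 * 64  # 32 bits for each of 64 pixels = 2K bits


def exp_buf_avail_bits(code):
    """Return the free output-buffer space in bits for a 7-bit code.

    Assumption: every code from 0 to 64 counts 2K-bit units.
    """
    if not 0 <= code <= 127:
        raise ValueError("exp_buf_avail is a 7-bit field")
    if code > 64:
        raise ValueError("codes 65-127 are reserved")
    return code * BITS_PER_UNIT


print(exp_buf_avail_bits(1))   # 2048 bits (2K)
print(exp_buf_avail_bits(64))  # 131072 bits (128K)
```

The second value matches the table's own arithmetic: 16 entries of 128 bits for each of 64 pixels.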


#### 24.2.8 SQ to TP: Control bus

Once every clock, the fetch unit tells the sequencer which RS line it is working on and whether the data in the GPRs is ready. The sequencer uses this to update the fetch valid bits for the reservation station. The sequencer also provides the instruction and constants for the fetch to execute, and the register-file address to which the fetch return data is written.

| Name                   | Direction | Bits | Description                                               |
|------------------------|-----------|------|-----------------------------------------------------------|
| TPx_SQ_data_rdy        | TPx→SQ    | 1    | Data ready                                                |
| TPx_SQ_rs_line_num     | TPx→SQ    | 6    | Line number in the Reservation station                    |
| TPx_SQ_type            | TPx→SQ    | 1    | Type of data sent (0:PIXEL, 1:VERTEX)                     |
| SQ_TPx_send            | SQ→TPx    | 1    | Sending valid data                                        |
| SQ_TPx_const           | SQ→TPx    | 48   | Fetch state sent over 4 clocks (192 bits total)           |
| SQ_TPx_instr           | SQ→TPx    | 24   | Fetch instruction sent over 4 clocks                      |
| SQ_TPx_end_of_group    | SQ→TPx    | 1    | Last instruction of the group                             |
| SQ_TPx_Type            | SQ→TPx    | 1    | Type of data sent (0:PIXEL, 1:VERTEX)                     |
| SQ_TPx_gpr_phase       | SQ→TPx    | 2    | Write phase signal                                        |
| SQ_TP0_lod_correct     | SQ→TP0    | 6    | LOD correct 3 bits per comp 2 components per quad         |
| SQ_TP0_pix_mask        | SQ→TP0    | 4    | Pixel mask 1 bit per pixel                                |
| SQ_TP1_lod_correct     | SQ→TP1    | 6    | LOD correct 3 bits per comp 2 components per quad         |
| SQ_TP1_pix_mask        | SQ→TP1    | 4    | Pixel mask 1 bit per pixel                                |
| SQ_TP2_lod_correct     | SQ→TP2    | 6    | LOD correct 3 bits per comp 2 components per quad         |
| SQ_TP2_pix_mask        | SQ→TP2    | 4    | Pixel mask 1 bit per pixel                                |
| SQ_TP3_lod_correct     | SQ→TP3    | 6    | LOD correct 3 bits per comp 2 components per quad         |
| SQ_TP3_pix_mask        | SQ→TP3    | 4    | Pixel mask 1 bit per pixel                                |
| SQ_TPx_rs_line_num     | SQ→TPx    | 6    | Line number in the Reservation station                    |
| SQ_TPx_write_gpr_index | SQ→TPx    | 7    | Index into Register file for write of returned Fetch Data |
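
As a rough illustration of the 4-clock transfer of `SQ_TPx_const`, the sketch below splits a 192-bit fetch state into four 48-bit beats and reassembles it. The least-significant-beat-first ordering and the function names are assumptions; the table only fixes the bus width and the number of clocks.

```python
BEAT_BITS = 48   # width of SQ_TPx_const
BEATS = 4        # clocks per fetch state
BEAT_MASK = (1 << BEAT_BITS) - 1


def split_fetch_state(state):
    """Split a 192-bit value into four 48-bit beats (assumed LSB beat first)."""
    if not 0 <= state < 1 << (BEAT_BITS * BEATS):
        raise ValueError("fetch state must fit in 192 bits")
    return [(state >> (BEAT_BITS * i)) & BEAT_MASK for i in range(BEATS)]


def join_fetch_state(beats):
    """Rebuild the 192-bit value from four beats in the same assumed order."""
    if len(beats) != BEATS:
        raise ValueError("expected exactly four beats")
    state = 0
    for i, beat in enumerate(beats):
        state |= (beat & BEAT_MASK) << (BEAT_BITS * i)
    return state
```

The same pattern would apply to `SQ_TPx_instr`, with a 24-bit beat instead of a 48-bit one.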

#### 24.2.9 TP to SQ: Texture stall

The TP sends this signal to the SQ and the SPs when its input buffer is full.



| Name              | Direction | Bits | Description                                  |
|-------------------|-----------|------|----------------------------------------------|
| TP_SQ_fetch_stall | TP→SQ     | 1    | Do not send more texture requests if asserted |


#### 24.2.10 SQ to SP: Texture stall

| Name               | Direction | Bits | Description                                  |
|--------------------|-----------|------|----------------------------------------------|
| SQ_SPx_fetch_stall | SQ→SPx    | 1    | Do not send more texture requests if asserted |

#### 24.2.11 SQ to SP: GPR and auto counter

| Name                 | Direction | Bits | Description                                                                                                                      |
|----------------------|-----------|------|----------------------------------------------------------------------------------------------------------------------------------|
| SQ_SPx_gpr_wr_addr   | SQ→SPx    | 7    | Write address                                                                                                                    |
| SQ_SPx_gpr_rd_addr   | SQ→SPx    | 7    | Read address                                                                                                                     |
| SQ_SPx_gpr_rd_en     | SQ→SPx    | 1    | Read Enable                                                                                                                      |
| SQ_SP0_gpr_wr_en     | SQ→SP0    | 1    | Write Enable for the GPRs of SP0                                                                                                 |
| SQ_SP1_gpr_wr_en     | SQ→SP1    | 1    | Write Enable for the GPRs of SP1                                                                                                 |
| SQ_SP2_gpr_wr_en     | SQ→SP2    | 1    | Write Enable for the GPRs of SP2                                                                                                 |
| SQ_SP3_gpr_wr_en     | SQ→SP3    | 1    | Write Enable for the GPRs of SP3                                                                                                 |
| SQ_SPx_gpr_phase     | SQ→SPx    | 2    | The phase mux (arbitrates between inputs, ALU SRC reads and writes)                                                              |
| SQ_SPx_channel_mask  | SQ→SPx    | 4    | The channel mask                                                                                                                 |
| SQ_SPx_gpr_input_sel | SQ→SPx    | 2    | When the phase mux selects the inputs, this selects the source to read from: Interpolated data, VTX0, VTX1, autogen counter.     |
| SQ_SPx_auto_count    | SQ→SPx    | 12?  | Auto count generated by the SQ, common for all shader pipes                                                                      |




#### 24.2.12 SQ to SPx: Instructions

| Name               | Direction | Bits | Description                                                                                                                                                     |
|--------------------|-----------|------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------|
| SQ_SPx_instr_start | SQ→SPx    | 1    | Instruction start                                                                                                                                               |
| SQ_SP_instr        | SQ→SPx    | 21   | Transferred over 4 cycles<br>0: SRC A Select 2:0, SRC A Argument Modifier 3:3, SRC A swizzle 11:4, VectorDst 17:12, Unused 20:18<br>1: SRC B Select 2:0, SRC B Argument Modifier 3:3, SRC B swizzle 11:4, ScalarDst 17:12, Unused 20:18<br>2: SRC C Select 2:0, SRC C Argument Modifier 3:3, SRC C swizzle 11:4, Unused 20:12<br>3: Vector Opcode 4:0, Scalar Opcode 10:5, Vector Clamp 11:11, Scalar Clamp 12:12, Vector Write Mask 16:13, Scalar Write Mask 20:17 |
| SQ_SPx_exp_alu_id  | SQ→SPx    | 1    | ALU ID                                                                                                                                                          |
| SQ_SPx_exporting   | SQ→SPx    | 2    | 0: Not Exporting 1: Vector Exporting 2: Scalar Exporting                                                                                                        |
| SQ_SPx_stall       | SQ→SPx    | 1    | Stall signal                                                                                                                                                    |
| SQ_SP0_write_mask  | SQ→SP0    | 4    | Result of pixel kill in the shader pipe, which must be output for all pixel exports (depth and all color buffers). 4x4 because 16 pixels are computed per clock |
| SQ_SP1_write_mask  | SQ→SP1    | 4    | Result of pixel kill in the shader pipe, which must be output for all pixel exports (depth and all color buffers). 4x4 because 16 pixels are computed per clock |
| SQ_SP2_write_mask  | SQ→SP2    | 4    | Result of pixel kill in the shader pipe, which must be output for all pixel exports (depth and all color buffers). 4x4 because 16 pixels are computed per clock |
| SQ_SP3_write_mask  | SQ→SP3    | 4    | Result of pixel kill in the shader pipe, which must be output for all pixel exports (depth and all color buffers). 4x4 because 16 pixels are computed per clock |
|                    |           |      |                                                                                                                                                                 |
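
The bit ranges given for `SQ_SP_instr` can be turned into a small decoder. This is a minimal sketch, not the hardware's implementation: the field names and the `decode_instr` helper are hypothetical, and only the bit positions are taken from the table.

```python
# (field name, high bit, low bit) for each of the four 21-bit transfers.
LAYOUT = [
    [("src_a_select", 2, 0), ("src_a_arg_mod", 3, 3),
     ("src_a_swizzle", 11, 4), ("vector_dst", 17, 12)],
    [("src_b_select", 2, 0), ("src_b_arg_mod", 3, 3),
     ("src_b_swizzle", 11, 4), ("scalar_dst", 17, 12)],
    [("src_c_select", 2, 0), ("src_c_arg_mod", 3, 3),
     ("src_c_swizzle", 11, 4)],
    [("vector_opcode", 4, 0), ("scalar_opcode", 10, 5),
     ("vector_clamp", 11, 11), ("scalar_clamp", 12, 12),
     ("vector_write_mask", 16, 13), ("scalar_write_mask", 20, 17)],
]


def decode_instr(words):
    """Decode four 21-bit words (cycle 0 first) into a dict of fields."""
    if len(words) != 4:
        raise ValueError("an instruction is transferred over 4 cycles")
    fields = {}
    for word, layout in zip(words, LAYOUT):
        if not 0 <= word < 1 << 21:
            raise ValueError("each transfer is 21 bits wide")
        for name, hi, lo in layout:
            width = hi - lo + 1
            fields[name] = (word >> lo) & ((1 << width) - 1)
    return fields
```

Bits marked Unused in the table are simply ignored by this decoder.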

#### 24.2.13 SP to SQ: Constant address load / Predicate Set

| Name              | Direction | Bits | Description                                                                  |
|-------------------|-----------|------|------------------------------------------------------------------------------|
| SP0_SQ_const_addr | SP0→SQ    | 36   | Constant address load / predicate vector load (4 bits only) to the sequencer |
| SP0_SQ_valid      | SP0→SQ    | 1    | Data valid                                                                   |
| SP1_SQ_const_addr | SP1→SQ    | 36   | Constant address load / predicate vector load (4 bits only) to the sequencer |
| SP1_SQ_valid      | SP1→SQ    | 1    | Data valid                                                                   |
| SP2_SQ_const_addr | SP2→SQ    | 36   | Constant address load / predicate vector load (4 bits only) to the sequencer |
| SP2_SQ_valid      | SP2→SQ    | 1    | Data valid                                                                   |
| SP3_SQ_const_addr | SP3→SQ    | 36   | Constant address load / predicate vector load (4 bits only) to the sequencer |
| SP3_SQ_valid      | SP3→SQ    | 1    | Data valid                                                                   |
| SP0_SQ_data_type  | SP→SQ     | 1    | Data Type<br>0: Constant Load<br>1: Predicate Set                            |

#### 24.2.14 SQ to SPx: constant broadcast

| Name         | Direction | Bits | Description        |
|--------------|-----------|------|--------------------|
| SQ_SPx_const | SQ→SPx    | 128  | Constant broadcast |

#### 24.2.15 SP0 to SQ: Kill vector load

| Name             | Direction | Bits | Description      |
|------------------|-----------|------|------------------|
| SP0_SQ_kill_vect | SP0→SQ    | 4    | Kill vector load |
| SP1_SQ_kill_vect | SP1→SQ    | 4    | Kill vector load |
| SP2_SQ_kill_vect | SP2→SQ    | 4    | Kill vector load |
| SP3_SQ_kill_vect | SP3→SQ    | 4    | Kill vector load |

#### 24.2.16 SQ to CP: RBBM bus

| Name           | Direction | Bits | Description          |
|----------------|-----------|------|----------------------|
| SQ_RBB_rs      | SQ→CP     | 1    | Read Strobe          |
| SQ_RBB_rd      | SQ→CP     | 32   | Read Data            |
| SQ_RBBM_nrtrtr | SQ→CP     | 1    | Optional             |
| SQ_RBBM_rtr    | SQ→CP     | 1    | Real-Time (Optional) |

#### 24.2.17 CP to SQ: RBBM bus

| Name               | Direction | Bits | Description                        |
|--------------------|-----------|------|------------------------------------|
| rbbm_we            | CP→SQ     | 1    | Write Enable                       |
| rbbm_a             | CP→SQ     | 15   | Address; upper extent is TBD (16:2) |
| rbbm_wd            | CP→SQ     | 32   | Data                               |
| rbbm_be            | CP→SQ     | 4    | Byte Enables                       |
| rbbm_re            | CP→SQ     | 1    | Read Enable                        |
| rbb_rs0            | CP→SQ     | 1    | Read Return Strobe 0               |
| rbb_rs1            | CP→SQ     | 1    | Read Return Strobe 1               |
| rbb_rd0            | CP→SQ     | 32   | Read Data 0                        |
| rbb_rd1            | CP→SQ     | 32   | Read Data 1                        |
| RBBM_SQ_soft_reset | CP→SQ     | 1    | Soft Reset                         |

#### 24.2.18 SQ to CP: State report

| Name             | Direction | Bits | Description            |
|------------------|-----------|------|------------------------|
| SQ_CP_vs_event   | SQ→CP     | 1    | Vertex Shader Event    |
| SQ_CP_vs_eventid | SQ→CP     | 2    | Vertex Shader Event ID |
| SQ_CP_ps_event   | SQ→CP     | 1    | Pixel Shader Event     |
| SQ_CP_ps_eventid | SQ→CP     | 2    | Pixel Shader Event ID  |

eventid = 0 => \*sEndOfState (e.g. VsEndOfState)

eventid = 1 => \*sDone (e.g. VsDone)

So, the CP will assume the Vs is done with a state whenever it gets a pulse on SQ\_CP\_vs\_event with SQ\_CP\_vs\_eventid = 0.
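
A minimal sketch of how a consumer might interpret the state-report signals follows. The `decode_event` helper and its argument names are hypothetical; only event IDs 0 and 1 are defined in this section, so the other two values of the 2-bit field are rejected.

```python
EVENT_SUFFIX = {0: "EndOfState", 1: "Done"}


def decode_event(shader, event, eventid):
    """Name the reported event, or return None when no pulse is present.

    shader is "Vs" or "Ps"; event is the 1-bit pulse; eventid is 2 bits.
    """
    if not event:
        return None
    if eventid not in EVENT_SUFFIX:
        raise ValueError("only event IDs 0 and 1 are defined here")
    return shader + EVENT_SUFFIX[eventid]


print(decode_event("Vs", 1, 0))  # VsEndOfState
```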


#### 24.3 Example of control flow program execution

We now provide some examples of execution to better illustrate the new design.

Given the program:

Alu 0

Alu 1

Tex 0

Tex 1

Alu 3 Serial

Alu 4

Tex 2

Alu 5

Alu 6 Serial

Tex 3

Alu 7

Alloc Position 1 buffer

Alu 8 Export

Tex 4

Alloc Parameter 3 buffers

Alu 9 Export 0

Tex 5

Alu 10 Serial Export 2

Alu 11 Export 1 End

Would be converted into the following CF instructions:

Execute Alu 0 Alu 0 Tex 0 Tex 0 Alu 1 Alu 0 Tex 0 Alu 0 Alu 1 Tex 0

Execute Alu 0 Alu 0 Tex 0 Alu 0

Alloc Position 1

Execute Alu 0 Tex 0

Alloc Param 3

Execute Alu 0 Tex 0 Alu 1 Alu 0 End

And the execution of this program would look like this:

Put thread in Vertex RS:

Control Flow Instruction Pointer (12 bits), (CFP)

Execution Count Marker (3 or 4 bits), (ECM)

Loop Iterators (4x9 bits), (LI)

Call return pointers (4x12 bits), (CRP)

Predicate Bits (4x64 bits), (PB)

Export ID (1 bit), (EXID)

GPR Base Ptr (8 bits), (GPR)

Export Base Ptr (7 bits), (EB)

Context Ptr (3 bits), (CPTR)

LOD correction bits (16x6 bits), (LOD)

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 0   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |
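
For a sense of scale, the field widths listed for the state bits can be summed to give the per-thread state storage. This is simple arithmetic on the widths given in this section; the function name is hypothetical, and the ECM width is left as a parameter because the text allows 3 or 4 bits.

```python
def state_bit_widths(ecm_bits=4):
    """Widths, in bits, of the per-thread state fields listed in the text."""
    return {
        "CFP": 12,
        "ECM": ecm_bits,
        "LI": 4 * 9,
        "CRP": 4 * 12,
        "PB": 4 * 64,
        "EXID": 1,
        "GPR": 8,
        "EB": 7,
        "CPTR": 3,
        "LOD": 16 * 6,
    }


print(sum(state_bit_widths(4).values()))  # 471
print(sum(state_bit_widths(3).values()))  # 470
```

The predicate bits (256) and LOD correction bits (96) account for most of the total.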

Valid Thread (VALID)

Texture/ALU engine needed (TYPE)

Texture Reads are outstanding (PENDING)


Waiting on Texture Read to Complete (SERIAL)

Allocation Wait (2 bits) (ALLOC)

00 - No allocation needed

01 - Position export allocation needed (ordered export)

10 - Parameter or pixel export needed (ordered export)

11 - pass thru (out of order export)

Allocation Size (4 bits) (SIZE)

Position Allocated (POS\_ALLOC)

First thread of a new context (FIRST)

Last (1 bit), (LAST)

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 0       | 0      | 0     | 0    | 0         | 1     | 0    |

Then the thread is picked up for the execution of the first control flow instruction:

Execute Alu 0 Alu 0 Tex 0 Tex 0 Alu 1 Alu 0 Tex 0 Alu 0 Alu 1 Tex 0

It executes the first two ALU instructions and goes back to the RS for a resource request change. Here is the state returned to the RS:

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 2   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | TEX  | 0       | 0      | 0     | 0    | 0         | 1     | 0    |

Then when the texture pipe frees up, the arbiter picks up the thread to issue the texture reads. The thread comes back in this state:

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 4   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 1      | 0     | 0    | 0         | 1     | 0    |

Because of the serial bit, the arbiter must wait for the texture to return and clear the PENDING bit before it can pick the thread up. Let's say that the texture reads are complete; the arbiter then picks up the thread and returns it in this state:

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 6   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | TEX  | 0       | 0      | 0     | 0    | 0         | 1     | 0    |
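
The pick-up rules used in this walkthrough can be summarized as a small predicate over the status bits. This is a simplified sketch, not the arbiter's actual logic: the `Status` record and `can_pick_up` function are hypothetical, allocation waits are left out, and the rule that a texture thread with reads still outstanding must wait is taken from the statement in this section that multiple texture requests cannot be issued.

```python
from dataclasses import dataclass


@dataclass
class Status:
    valid: bool
    type: str       # "ALU" or "TEX": engine the thread needs next
    pending: bool   # texture reads are outstanding
    serial: bool    # next instruction must wait for texture data


def can_pick_up(status, alu_free, tex_free):
    """Return True if the arbiter may pick the thread up (simplified)."""
    if not status.valid:
        return False
    if status.serial and status.pending:
        return False  # must wait for the TP to clear PENDING
    if status.type == "TEX":
        # Assumed from the text: no second request while one is outstanding.
        return tex_free and not status.pending
    return alu_free


# Thread with ECM = 4 in the walkthrough: ALU, PENDING = 1, SERIAL = 1.
print(can_pick_up(Status(True, "ALU", True, True), alu_free=True, tex_free=True))  # False
```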

Again the TP frees up, the arbiter picks up the thread and executes. It returns in this state:


| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 7   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 0      | 0     | 0    | 0         | 1     | 0    |

Now, even if the texture has not returned, we can still pick up the thread for ALU execution because the serial bit is not set. The thread will, however, come back to the RS before the second ALU instruction, because that instruction has the serial bit set.

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 8   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 1      | 0     | 0    | 0         | 1     | 0    |

As soon as the TP clears the pending bit the thread is picked up and returns:

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 9   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | TEX  | 0       | 0      | 0     | 0    | 0         | 1     | 0    |

Picked up by the TP and returns:

Execute Alu 0

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 1          | 0   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 0      | 0     | 0    | 0         | 1     | 0    |

Picked up by the ALU and returns (let's say the TP has not returned yet): Alloc Position 1

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 2          | 0   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |


| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 0      | 01    | 1    | 0         | 1     | 0    |

If the SX has the place for the export, the SQ is going to allocate and pick up the thread for execution. It returns to the RS in this state:

Execute Alu 0 Tex 0

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 3         | 1   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | TEX  | 1       | 0      | 0     | 0    | 1         | 1     | 0    |

Now, since the TP has not returned yet, we must wait for it to return because we cannot issue multiple texture requests. The TP returns, clears the PENDING bit and we proceed:

Alloc Param 3

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 4          | 0   | 0  | 0   | 0  | 1    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 0      | 10    | 3    | 1         | 1     | 0    |

Once again the SQ makes sure the SX has enough room in the Parameter cache before it can pick up this thread.

Execute Alu 0 Tex 0 Alu 1 Alu 0 End

| State Bits |     |    |     |    |      |     |     |      |     |
|------------|-----|----|-----|----|------|-----|-----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB  | CPTR | LOD |
| 5          | 1   | 0  | 0   | 0  | 1    | 0   | 100 | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | TEX  | 1       | 0      | 0     | 0    | 1         | 1     | 0    |

This executes on the TP and then returns:

| State Bits |     |    |     |    |      |     |     |      |     |
|------------|-----|----|-----|----|------|-----|-----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB  | CPTR | LOD |
| 5          | 2   | 0  | 0   | 0  | 1    | 0   | 100 | 0    | 0   |


Status Bits

| VALID | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
|-------|------|---------|--------|-------|------|-----------|-------|------|
| 1     | ALU  | 1       | 1      | 0     | 0    | 1         | 1     | 1    |

The thread waits for the TP to return because the texture reads are still pending (and SERIAL is set in this case). It then executes and does not return to the RS because the LAST bit is set. This is the end of this thread, and before dropping it on the floor the SQ notifies the SX of export completion.
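The pick-up decisions in the walkthrough above can be summarized as a small predicate over the status bits. The sketch below is only an illustration of the rules as this example describes them, not the hardware arbitration logic: the `Status` class, the `can_pick_up` function, the `sx_has_room` flag and the treatment of `ALLOC == 0` as "no export allocation requested" are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class Status:
    """Subset of the RS status bits used in the walkthrough (hypothetical model)."""
    valid: bool
    kind: str       # TYPE field: "ALU" or "TEX"
    pending: bool   # PENDING: a texture request is still outstanding in the TP
    serial: bool    # SERIAL: the next ALU clause must wait for the pending fetch
    alloc: int      # ALLOC: assumed 0 when no export space has to be allocated
    last: bool = False


def can_pick_up(s: Status, sx_has_room: bool = True) -> bool:
    """Return True if the thread may be picked up, following the example's rules."""
    if not s.valid:
        return False
    if s.kind == "TEX":
        # Only one texture request may be outstanding per thread.
        return not s.pending
    # ALU clause: a pending fetch only blocks execution when SERIAL is set.
    if s.pending and s.serial:
        return False
    # An allocating clause also needs room in the SX (export or parameter cache).
    if s.alloc != 0 and not sx_has_room:
        return False
    return True
```

For instance, the state with `TYPE=ALU, PENDING=1, SERIAL=0` is eligible while the state with `TYPE=ALU, PENDING=1, SERIAL=1` is not, which matches the narrative above.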

### 25. Open issues

Need to do some testing on the size of the register file as well as on the register file allocation method (dynamic vs. static).

Saving power?


Author:

Laurent Lefebvre

| Issue To: | Copy No |
|-----------|---------|
|           |         |

## **R400 Sequencer Specification**

SQ

Version 2.032

Overview: This is an architectural specification for the R400 Sequencer block (SEQ). It provides an overview of the required capabilities and expected uses of the block. It also describes the block interfaces, internal subblocks, and provides internal state diagrams.

**AUTOMATICALLY UPDATED FIELDS:** 

Document Location: C:\perforce\r400\doc\_lib\design\blocks\sq\R400\_Sequencer.doc

Current Intranet Search Title: R400 Sequencer Specification

|           | APPROVALS      |  |  |  |  |  |  |
|-----------|----------------|--|--|--|--|--|--|
| Name/Dept | Signature/Date |  |  |  |  |  |  |
|           |                |  |  |  |  |  |  |
|           |                |  |  |  |  |  |  |
|           |                |  |  |  |  |  |  |
|           |                |  |  |  |  |  |  |
|           |                |  |  |  |  |  |  |
|           |                |  |  |  |  |  |  |

Remarks:

THIS DOCUMENT CONTAINS CONFIDENTIAL INFORMATION THAT COULD BE SUBSTANTIALLY DETRIMENTAL TO THE INTEREST OF ATI TECHNOLOGIES INC. THROUGH UNAUTHORIZED USE OR DISCLOSURE.

"Copyright 2001, ATI Technologies Inc. All rights reserved. The material in this document constitutes an unpublished work created in 2001. The use of this copyright notice is intended to provide notice that ATI owns a copyright in this unpublished work. The copyright notice is not an admission that publication has occurred. This work contains confidential, proprietary information and trade secrets of ATI. No part of this document may be used, reproduced, or transmitted in any form or by any means without the prior written permission of ATI Technologies Inc."


ATI 2031 LG v. ATI IPR2015-00325




# Table Of Contents

| Section | Title                                          |
|---------|------------------------------------------------|
| 1.      | OVERVIEW                                       |
| 1.1     | Top Level Block Diagram                        |
| 1.2     | Data Flow graph (SP)                           |
| 1.3     | Control Graph                                  |
| 2.      | INTERPOLATED DATA BUS                          |
| 3.      | INSTRUCTION STORE                              |
| 4.      | SEQUENCER INSTRUCTIONS                         |
| 5.      | CONSTANT STORES                                |
| 5.1     | Memory organizations                           |
| 5.2     | Management of the Control Flow Constants       |
| 5.3     | Management of the re-mapping tables            |
| 5.3.1   | R400 Constant management                       |
| 5.3.2   | Proposal for R400LE constant management        |
| 5.3.3   | Dirty bits                                     |
| 5.3.4   | Free List Block                                |
| 5.3.5   | De-allocate Block                              |
| 5.3.6   | Operation of Incremental model                 |
| 5.4     | Constant Store Indexing                        |
| 5.5     | Real Time Commands                             |
| 5.6     | Constant Waterfalling                          |
| 6.      | LOOPING AND BRANCHES                           |
| 6.1     | The controlling state                          |
| 6.2     | The Control Flow Program                       |
| 6.2.1   | Control flow instructions table                |
| 6.3     | Implementation                                 |
| 6.4     | Data dependent predicate instructions          |
| 6.5     | HW Detection of PV,PS                          |
| 6.6     | Register file indexing                         |
| 6.7     | Debugging the Shaders                          |
| 6.7.1   | Method 1: Debugging registers                  |
| 6.7.2   | Method 2: Exporting the values in the GPRs     |
| 7.      | PIXEL KILL MASK                                |
| 8.      | MULTIPASS VERTEX SHADERS (HOS)                 |
| 9.      | REGISTER FILE ALLOCATION                       |
| 10.     | FETCH ARBITRATION                              |
| 11.     | ALU ARBITRATION                                |
| 12.     | HANDLING STALLS                                |
| 13.     | CONTENT OF THE RESERVATION STATION FIFOS       |
| 14.     | THE OUTPUT FILE                                |
| 15.     | IJ FORMAT                                      |
| 15.1    | Interpolation of constant attributes           |
| 16.     | STAGING REGISTERS                              |
| 17.     | THE PARAMETER CACHE                            |
| 17.1    | Export restrictions                            |
| 17.1.1  | Pixel exports                                  |
| 17.1.2  | Vertex exports                                 |
| 17.1.3  | Pass thru exports                              |
| 17.2    | Arbitration restrictions                       |
| 19.     | SPECIAL INTERPOLATION MODES                    |
| 19.1    | Real time commands                             |
| 19.2    | Sprites/ XY screen coordinates/ FB information |
| 19.3.1  | Vertex shaders                                 |
| 19.3.2  | Pixel shaders                                  |
| 20.     | STATE MANAGEMENT                               |
| 23.     | DEBUG REGISTERS                                |
| 24.2.1  | SC to SP#                                      |
| 24.2.2  | SC to SQ                                       |
| 24.2.3  | SQ to SX: Interpolator bus                     |
| 24.2.4  | SQ to SP: Staging Register Data                |
| 24.2.5  | VGT to SQ: Vertex interface                    |
| 24.2.6  | SQ to SX: Control bus                          |
| 24.2.7  |                                                |
| 24.2.8  |                                                |
| 24.2.9  |                                                |
| 24.2.10 |                                                |
| 24.2.11 |                                                |
| 24.2.12 |                                                |
| 24.2.13 | SP to SQ: Constant address load/ Predicate Set |
| 24.2.14 | SQ to SPx: constant broadcast                  |
| 24.2.15 | SP0 to SQ: Kill vector load                    |
| 24.2.16 | SQ to CP: RBBM bus                             |
| 24.2.17 | CP to SQ: RBBM bus                             |
| 24.2.18 | SQ to CP: State report                         |
| 24.3    | Example of control flow program execution      |
| 25.     | OPEN ISSUES                                    |




## **Revision Changes:**

| Rev  | Author           | Date               | Changes |
|------|------------------|--------------------|---------|
| 0.1  | Laurent Lefebvre | May 7, 2001        | First draft. |
| 0.2  | Laurent Lefebvre | July 9, 2001       | Changed the interfaces to reflect the changes in the SP. Added some details in the arbitration section. |
| 0.3  | Laurent Lefebvre | August 6, 2001     | Reviewed the Sequencer spec after the meeting on August 3, 2001. |
| 0.4  | Laurent Lefebvre | August 24, 2001    | Added the dynamic allocation method for register file and an example (written in part by Vic) of the flow of pixels/vertices in the sequencer. |
| 0.5  | Laurent Lefebvre | September 7, 2001  | Added timing diagrams (Vic). |
| 0.6  | Laurent Lefebvre | September 24, 2001 | Changed the spec to reflect the new R400 architecture. Added interfaces. |
| 0.7  | Laurent Lefebvre | October 5, 2001    | Added constant store management, instruction store management, control flow management and data dependent predication. |
| 0.8  | Laurent Lefebvre | October 8, 2001    | Changed the control flow method to be more flexible. Also updated the external interfaces. |
| 0.9  | Laurent Lefebvre | October 17, 2001   | Incorporated changes made in the 10/18/01 control flow meeting. Added a NOP instruction, removed the conditional\_execute\_or\_jump. Added debug registers. |
| 1.0  | Laurent Lefebvre | October 19, 2001   | Refined interfaces to RB. Added state registers. |
| 1.1  | Laurent Lefebvre | October 26, 2001   | Added SEQ→SP0 interfaces. Changed delta precision. Changed VGT→SP0 interface. Debug Methods added. |
| 1.2  | Laurent Lefebvre | November 16, 2001  | Interfaces greatly refined. Cleaned up the spec. |
| 1.3  | Laurent Lefebvre | November 26, 2001  | Added the different interpolation modes. |
| 1.4  | Laurent Lefebvre | December 6, 2001   | Added the auto incrementing counters. Changed the VGT→SQ interface. Added content on constant management. Updated GPRs. |
| 1.5  | Laurent Lefebvre | December 11, 2001  | Removed from the spec all interfaces that weren't directly tied to the SQ. Added explanations on constant management. Added PA→SQ synchronization fields and explanation. |
| 1.6  | Laurent Lefebvre | January 7, 2002    | Added more details on the staging register. Added detail about the parameter caches. Changed the call instruction to a Conditionnal\_call instruction. Added details on constant management and updated the diagram. |
| 1.7  | Laurent Lefebvre | February 4, 2002   | Added Real Time parameter control in the SX interface. Updated the control flow section. |
| 1.8  | Laurent Lefebvre | March 4, 2002      | New interfaces to the SX block. Added the end of clause modifier, removed the end of clause instructions. |
| 1.9  | Laurent Lefebvre | March 18, 2002     | Rearrangement of the CF instruction bits in order to ensure byte alignment. |
| 1.10 | Laurent Lefebvre | March 25, 2002     | Updated the interfaces and added a section on exporting rules. |
| 1.11 | Laurent Lefebvre | April 19, 2002     | Added CP state report interface. Last version of the spec with the old control flow scheme. |
| 2.0  | Laurent Lefebvre | April 19, 2002     | New control flow scheme. |
| 2.01 | Laurent Lefebvre | May 2, 2002        | Changed slightly the control flow instructions to allow force jumps and calls. |
| 2.02 | Laurent Lefebvre | May 13, 2002       | Updated the Opcodes. Added type field to the constant/pred interface. Added Last field to the SQ→SP instruction load interface. |
| 2.03 | Laurent Lefebvre | July 15, 2002      | SP interface updated to include predication optimizations. Added the predicate no stall instructions. |




#### 1. Overview

The sequencer chooses two ALU threads and a fetch thread to execute, and executes all of the instructions in a block before looking for a new clause of the same type. Two ALU threads are executed interleaved to hide the ALU latency. The arbitrator will give priority to older threads. There are two separate reservation stations, one for pixel vectors and one for vertex vectors. This way a pixel can pass a vertex and a vertex can pass a pixel.

To support the shader pipe the sequencer also contains the shader instruction cache, constant store, control flow constants and texture state. The four shader pipes also execute the same instruction thus there is only one sequencer for the whole chip.

The sequencer first arbitrates between vectors of 64 vertices that arrive directly from primitive assembly and vectors of 16 quads (64 pixels) that are generated in the scan converter.

The vertex or pixel program specifies how many GPRs it needs to execute. The sequencer will not start the next vector until the needed space is available in the GPRs.




# 1.1 Top Level Block Diagram



Figure 2: Reservation stations and arbiters

Under this new scheme, the sequencer (SQ) will only use one global state management machine per vector type (pixel, vertex) that we call the reservation station (RS).






The gray area represents blocks that are replicated 4 times per shader pipe (16 times on the overall chip).

# 1.3 Control Graph



Figure 4: Sequencer Control interfaces

The Fetch control interface is shown in green, the ALU control interface in red, the Interpolated/Vector control interface in blue, and the output file control interface in purple.

# 2. Interpolated data bus

The interpolators contain an IJ buffer to pack the information as much as possible before writing it to the register file.



Figure 6: Interpolation timing diagram


Above is an example of a tile the sequencer might receive from the SC. The write side is how the data gets stacked into the XY and IJ buffers; the read side is how the data is passed to the GPRs. The IJ information is packed in the IJ buffer 4 quads at a time, or two clocks. The sequencer allows at any given time as many as four quads to interpolate a parameter. They all have to come from the same primitive. Then the sequencer controls the write mask to the GPRs to write the valid data in.

# 3. Instruction Store

There is going to be only one instruction store for the whole chip. It will contain 4096 instructions of 96 bits each.

It is likely to be a 1 port memory; we use 1 clock to load the ALU instruction, 1 clock to load the Fetch instruction, 1 clock to load 2 control flow instructions and 1 clock to write instructions.

The instruction store is loaded by the CP thru the register mapped registers.

The VS\_BASE and PS\_BASE context registers are used to specify for each context where its shader is in the instruction memory.

For the Real time commands the story is quite the same but for some small differences. There are no wrap-around points for real time so the driver must be careful not to overwrite regular shader data. The shared code (shared subroutines) uses the same path as real time.

#### 4. Sequencer Instructions

All control flow instructions and move instructions are handled by the sequencer only. The ALUs will perform NOPs during this time (MOV PV,PV, PS,PS) if they have nothing else to do.

# 5. Constant Stores

# 5.1 Memory organizations

A likely size for the ALU constant store is 1024x128 bits. The read BW from the ALU constant store is 128 bits/clock and the write bandwidth is 32 bits/clock (directed by the CP bus size not by memory ports).

The maximum logical size of the constant store for a given shader is 256 constants, or 512 for the pixel/vertex shader pair. The size of the re-mapping table is 128 lines (each line addresses 4 constants). The write granularity is 4 constants or 512 bits. It takes 16 clocks to write the four constants. Real time requires 256 lines in the physical memory (this is physically register mapped).

The texture state is also kept in a similar memory. The size of this memory is 320x96 bits (128 texture states for regular mode, 32 states for RT). The memory thus holds 128 texture states (192 bits per state). The logical size exposes 32 different states total, which are going to be shared between the pixel and the vertex shader. The size of the re-mapping table for the texture state memory is 32 lines (each line addresses 1 texture state line in the real memory). The CP write granularity is 1 texture state line (or 192 bits). The driver sends 512 bits but the CP ignores the top 320 bits. It thus takes 6 clocks to write the texture state. Real time requires 32 lines in the physical memory (this is physically register mapped).
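The write-time and capacity figures quoted above follow from the 32 bits/clock CP write bandwidth and the stated memory shapes. A short sketch of that arithmetic is given below; the function and variable names are hypothetical and are introduced only for this check.

```python
CP_WRITE_BITS_PER_CLOCK = 32  # write bandwidth stated for the CP bus


def write_clocks(bits: int, bus_bits: int = CP_WRITE_BITS_PER_CLOCK) -> int:
    """Clocks needed to push `bits` over the CP bus (ceiling division)."""
    return -(-bits // bus_bits)


# ALU constants: write granularity is 4 constants of 128 bits each.
alu_write_clocks = write_clocks(4 * 128)        # 512 bits -> 16 clocks

# Texture state: one state is 192 bits (two 96-bit lines).
tex_write_clocks = write_clocks(192)            # 192 bits -> 6 clocks

# ALU re-mapping table: 128 lines, each addressing 4 constants.
alu_logical_constants = 128 * 4                 # 512 logical constants

# Texture memory: 320 lines of 96 bits, at 192 bits per state.
tex_states_total = (320 * 96) // 192            # 160 = 128 regular + 32 RT
```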

The control flow constant memory doesn't sit behind a renaming table. It is register mapped and thus the driver must reload its content each time there is a change in the control flow constants. Its size is 320\*32 because it must hold 8 copies of the 32 dwords of control flow constants and the loop construct constants must be aligned.

The constant re-mapping tables for texture state and ALU constants are logically register mapped for regular mode and physically register mapped for RT operation.




# 5.2 Management of the Control Flow Constants

The control flow constants are register mapped: the CP writes to the corresponding register to set a constant, and the SQ decodes the address and writes to the block pointed to by its current base pointer (CF\_WR\_BASE). On the read side, one level of indirection is used. A register (SQ\_CONTEXT\_MISC.CF\_RD\_BASE) keeps the current base pointer to the control flow block. This register is copied whenever there is a state change. Should the CP write to CF after the state change, the base register is updated with (current pointer number + 1) % number of states. This way, if the CP doesn't write to CF, the new state uses the previous CF constants.
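The base pointer update can be illustrated with a minimal model. This is a sketch under stated assumptions, not the SQ implementation: it uses a single `base` value to stand in for both CF\_WR\_BASE and CF\_RD\_BASE, it assumes 8 states (the number of copies mentioned in section 5.1), and it assumes the pointer advances only on the first CP write after a state change. The class and method names are hypothetical.

```python
NUM_STATES = 8  # assumed: one block per copy of the control flow constants


class CfBasePointers:
    """Toy model of the control flow constant base pointer (hypothetical)."""

    def __init__(self, num_states: int = NUM_STATES) -> None:
        self.num_states = num_states
        self.base = 0          # stands in for CF_WR_BASE / CF_RD_BASE
        self.advanced = True   # no state change is outstanding at reset

    def state_change(self) -> int:
        """A new state starts by inheriting (copying) the current base pointer."""
        self.advanced = False
        return self.base

    def cp_write(self) -> int:
        """Return the block that receives a CP write to the CF constants."""
        if not self.advanced:
            # First write after a state change: move to the next block.
            self.base = (self.base + 1) % self.num_states
            self.advanced = True
        return self.base
```

With this model, two state changes with no intervening CF write keep reading block 0, while a write after a state change moves the pointer to block 1 and later writes in the same state stay there.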

#### 5.3 Management of the re-mapping tables

#### 5.3.1 R400 Constant management

The sequencer is responsible for managing two re-mapping tables (one for the constant store and one for the texture state). On a state change (by the driver), the sequencer will broadside copy the contents of its re-mapping tables to a new one. We have 8 different re-mapping tables we can use concurrently.

The constant memory update will be incremental; the driver only needs to update the constants that actually changed between the two state changes.

For this model to work in its simplest form, the requirement is that the physical memory MUST be at least twice as large as the logical address space + the space allocated for Real Time. In our case, since the logical address space is 512 and the reserved RT space can be up to 256 entries, the memory must be of sizes 1280 and above. Similarly the size of the texture store must be of 32\*2+32 = 96 entries and above.

#### 5.3.2 Proposal for R400LE constant management

To make this scheme work with only 512+256 = 768 entries, upon reception of a CONTROL packet of state + 1, the sequencer would check for SQ\_IDLE and PA\_IDLE and, if both are idle, will erase the content of state to replace it with the new state (this is depicted in Figure 8: De-allocation mechanism). Note that in the case a state is cleared, a value of 0 is written to the corresponding de-allocation counter location so that when the SQ is going to report a state change, nothing will be de-allocated upon the first report.

The second path sets all context dirty bits that were used in the current state to 1 (thus allowing the new state to reuse these physical addresses if needed).








Figure 8: De-allocation mechanism for R400LE

#### 5.3.3 Dirty bits

Two sets of dirty bits will be maintained per logical address. The first one (the reset dirty bit) will be set to zero on reset and set when the logical address is addressed. The second one (the context dirty bit) will be set to zero whenever a new context is written and set for each address written while in this context. If the reset dirty is not set, then writing to that logical address will not require de-allocation of whatever address is stored in the renaming table. If it is set and the context dirty is not set, then the stored physical address needs to be de-allocated and a new physical address is necessary to store the incoming data. If they are both set, then the data will be written into the physical address held in the renaming table for the current logical address. No de-allocation or allocation takes place. This will happen when the driver does a set constant twice to the same logical address between context changes. NOTE: It is important to detect this case and not allocate for it; failing to do so would allow multiple writes to allocate all of the physical memory and thus hang, because a context would not fit for rendering to start and thus free up space.
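The three cases can be summarized as a small decision function. This is a minimal sketch: the function name and the action labels it returns are hypothetical, and only the case analysis is taken from the text.

```python
def classify_write(reset_dirty: bool, context_dirty: bool) -> str:
    """Return the action for a set-constant write to one logical address."""
    if not reset_dirty:
        # Never written since reset: the renaming table entry holds nothing
        # worth freeing, so only an allocation is needed.
        return "allocate"
    if not context_dirty:
        # Written in an earlier context: the old physical address goes to
        # the free list and a new physical address is allocated.
        return "deallocate_and_allocate"
    # Already written in this context: reuse the physical address in place.
    return "overwrite"
```

The combination "reset dirty clear, context dirty set" should not occur, since any write that sets the context dirty bit also sets the reset dirty bit; the sketch treats it like the first case.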

#### 5.3.4 Free List Block

A free list block would consist of a counter (called the IFC or Initial Free Counter) that resets to zero and is incremented every time a chunk of physical memory is used, until they have all been used once. This counter would be checked each time a physical block is needed: if the original ones have not been used up, use a new one, else check the free list for an available physical block address. When getting a chunk from the counter, the count is the physical address.

The block also requires storage for a free list big enough to store all physical block addresses.

Three pointers, all reset to zero, are maintained for the free list. The first one we will call write\_ptr. This pointer will identify the next location to write the physical address of a block to be de-allocated. Note: we can never free more physical memory locations than we have. Once an address is recorded, the pointer will be incremented to walk the free list like a ring.

The second pointer will be called stop\_ptr. The stop\_ptr pointer will be advanced by the number of address chunks de-allocated when a context finishes. The addresses between the stop\_ptr and write\_ptr cannot be reused because they are still in use. But as soon as the context using them is dismissed, the stop\_ptr will be advanced.

The third pointer will be called read\_ptr. This pointer will point to the next address that can be used for allocation as long as the read\_ptr does not equal the stop\_ptr and the IFC is at its maximum count.
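The counter and the three pointers can be sketched in Python as follows. The class and method names (`FreeList`, `allocate`, `record_free`, `release`) are hypothetical, and the sketch ignores the corner case where every physical block sits on the free list at once (read\_ptr would then equal stop\_ptr again and the ring would look empty).

```python
class FreeList:
    """Toy model of the IFC counter plus the write/stop/read ring pointers."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.ifc = 0                      # Initial Free Counter
        self.ring = [None] * num_blocks   # one slot per physical block
        self.write_ptr = 0
        self.stop_ptr = 0
        self.read_ptr = 0

    def allocate(self):
        """Return a physical block address, or None if nothing is free."""
        if self.ifc < self.num_blocks:
            addr = self.ifc               # the count is the physical address
            self.ifc += 1
            return addr
        if self.read_ptr != self.stop_ptr:
            addr = self.ring[self.read_ptr]
            self.read_ptr = (self.read_ptr + 1) % self.num_blocks
            return addr
        return None                       # caller must apply backpressure

    def record_free(self, addr):
        """Note a block to de-allocate; it is not reusable until released."""
        self.ring[self.write_ptr] = addr
        self.write_ptr = (self.write_ptr + 1) % self.num_blocks

    def release(self, count):
        """A context finished: make `count` recorded blocks reusable."""
        self.stop_ptr = (self.stop_ptr + count) % self.num_blocks
```

A block recorded with `record_free` only becomes visible to `allocate` after `release` advances stop\_ptr past it, which mirrors the rule that addresses between stop\_ptr and write\_ptr are still in use.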




# 5.3.5 De-allocate Block

This block will maintain a free physical address block count for each context. While in the current context, a count shall be maintained specifying how many blocks were written into the free list at the write\_ptr pointer. This count will be reset upon reset or when this context is active on the back and different from the previous context. It is actually a count of blocks in the previous context that will no longer be used. This count will be used to advance the stop\_ptr pointer to make available the set of physical blocks freed when the previous context was done. This allows the discard or de-allocation of any number of blocks in one clock.

#### 5.3.6 Operation of Incremental model

The basic operation of the model would start with the write\_ptr, stop\_ptr and read\_ptr pointers in the free list set to zero and the free list counter set to zero. Also all the dirty bits and the previous context will be initialized to zero. When the first set constants happen, the reset dirty bit will not be set, so we will allocate a physical location from the free list counter because it's not at the max value. The data will be written into physical address zero. Both the additional copy of the renaming table and the context zero entry of the big renaming table will be updated for the logical address that was written by the set, starting with physical address 0. This process will be repeated for any logical addresses that are not dirty until the context changes. If a logical address is hit that has its dirty bits set while in the same context, both dirty bits would be set, so the new data will overwrite the last physical address assigned for this logical address.

When the first draw command of the context is detected, the previous context stored in the additional renaming table will be copied to the larger renaming table in the current (new) context location. Then the set constant logical address will be loaded with a new physical address during the copy and, if the reset dirty was set, the physical address it replaced in the renaming table would be entered at the write\_ptr pointer location on the free list and the write\_ptr will be incremented. The de-allocation counter for the previous context (eight) will be incremented. Then, as set states come in for this context, one of the following will happen:

- 1.) No dirty bits are set for the logical address being updated. A line will be allocated from the free-list counter, or from the free list at the read\_ptr pointer if read\_ptr != stop\_ptr.
- 2.) Reset dirty set and context dirty not set. A new physical address is allocated, the physical address in the renaming table is put on the free list at write\_ptr, and write\_ptr is incremented along with the de-allocate counter for the last context.
- 3.) Context dirty is set. The data will be written into the physical address specified by the logical address.

This process will continue as long as set states arrive. This block will provide backpressure to the CP whenever it has no free list entries available (counter at max and stop\_ptr == read\_ptr). The command stream will keep a count of contexts of constants in use and prevent more than the maximum number of constant contexts from being sent.

Whenever a draw packet arrives, the content of the re-mapping table is written to the correct re-mapping table for the context number. Also, if the next context uses fewer constants than the current one, all exceeding lines are moved to the free list to be de-allocated later. This happens in parallel with the writing of the re-mapping table to the correct memory.

Preferably, when the constant context leaves the last ALU clause, it will be sent to this block and compared with the previous context that left (initialized to zero). If they differ, then the older context will no longer be referenced and thus can be de-allocated in the physical memory. This is accomplished by adding the number of blocks freed by this context to the stop\_ptr pointer. This will make all the physical addresses used by this context available to the read\_ptr allocate pointer for future allocation.

This device allows representation of multiple contexts of constant data with N copies of the logical address space. It also allows the second context to be represented as the first set plus some new additional data by just storing the deltas. It allows memory to be used efficiently, and when the constant updates are small it can store multiple contexts. However, if the updates are large, fewer contexts will be stored and performance will potentially be degraded, although it will still perform as well as a ring could in this case.

## 5.4 Constant Store Indexing

In order to do constant store indexing, the sequencer must be loaded first with the indexes (that come from the GPRs). There are 144 wires from the exit of the SP to the sequencer (9 bits pointers x 16 vertexes/clock). Since the data must pass thru the Shader pipe for the float to fixed conversion, there is a latency of 4 clocks (1 instruction) between the time the sequencer is loaded and the time one can index into the constant store. The assembly will look like this:

MOVA R1.X,R2.X // Loads the sequencer with the content of R2.X, also copies the content of R2.X into R1.X

NOP // latency of the float to fixed conversion

ADD R3,R4,C0[R2.X] // Uses the state from the sequencer to add R4 to C0[R2.X] into R3

Note that we don't really care about what is in the brackets because we use the state from the MOVA instruction. R2.X is just written again for the sake of simplicity and coherency.

The storage needed in the sequencer in order to support this feature is 2\*64\*9 bits = 1152 bits.
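The figures quoted in this section can be cross-checked with a few lines of Python. The constant names are hypothetical, and two readings are assumptions rather than statements from the text: that 64 is the number of vertices per vector, and that the factor of 2 covers two interleaved vectors.

```python
POINTER_BITS = 9          # one 9-bit index per vertex
VERTICES_PER_CLOCK = 16   # indexes delivered by the SP each clock
VECTOR_SIZE = 64          # assumption: vertices per vector
INTERLEAVED_VECTORS = 2   # assumption: meaning of the factor of 2

wires = POINTER_BITS * VERTICES_PER_CLOCK
storage_bits = INTERLEAVED_VECTORS * VECTOR_SIZE * POINTER_BITS
addressable_constants = 2 ** POINTER_BITS
clocks_to_load = VECTOR_SIZE // VERTICES_PER_CLOCK

print(wires, storage_bits, addressable_constants, clocks_to_load)
```

Under these assumptions the numbers agree with the rest of the section: 144 wires, 1152 bits of storage, 512 addressable constants (the size of the logical address space), and 4 clocks to deliver one vector of indexes.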

#### 5.5 Real Time Commands

The real time command constants are written by the CP using the register mapped registers allocated for RT. This works the same way as regular constant loads, BUT in this case the CP is not sending a logical address but rather a physical address, and the reads are not passing thru the re-mapping table but are directly read from the memory. The boundary between the two zones is defined by the CONST\_EO\_RT control register. Similarly, for the fetch state, the boundary between the two zones is defined by the TSTATE\_EO\_RT control register.

# 5.6 Constant Waterfalling

In order to have reasonable performance in the case of constant store indexing using the address register, we are going to have the possibility of using the physical memory port for read only. This way we can read 1 constant per clock and thus have a worst-case waterfall mode of 1 vertex per clock. There is a small synchronization issue related to this, as the SQ needs to make sure that the constants were actually written to memory (not only sent to the sequencer) before it can allow the first vector of pixels or vertices of the state to go thru the ALUs. To do so, the sequencer keeps 8 bits (one per render state), sets a state's bit once the last write of that render state has reached memory, and clears the bit whenever the state is freed.



Figure 9: The instruction-Constant store




#### 6. Looping and Branches

Loops and branches are planned to be supported and will have to be dealt with at the sequencer level. We plan on supporting constant loops and branches using a control program.

## 6.1 The controlling state.

The R400 controlling state consists of:

Boolean[255:0]

Loop\_count[7:0][31:0]

Loop\_Start[7:0][31:0]

Loop\_Step[7:0][31:0]

That is 256 Booleans and 32 loops.

We have a stack of 4 elements for nested calls of subroutines and 4 loop counters to allow for nested loops.

This state is available on a per shader program basis.

# 6.2 The Control Flow Program

We'd like to be able to code up a program of the form:

1: Loop
2: Exec TexFetch
3: TexFetch
4: ALU
5: ALU
6: TexFetch
7: End Loop
8: ALU Export

But realize that 3: may be dependent on 2: and 4: is almost certainly dependent on 2: and 3:. Without clausing, these dependencies need to be expressed in the Control Flow instructions. Additionally, without separate 'texture clauses' and 'ALU clauses' we need to know which instructions to dispatch to the Texture Unit and which to the ALU unit. This information will be encapsulated in the flow control instructions.

Each control flow instruction will contain 2 bits of information for each (non-control flow) instruction:

- a) ALU or Texture
- b) Serialize Execution

(b) would force the thread to stop execution at this point (before the instruction is executed) and wait until all textures have been fetched. Given the allocation of reserved bits, this would mean that the count of an 'Exec' instruction would be limited to about 8 (non-control-flow) instructions. If more than this were needed, a second Exec (with the same conditions) would be issued.

Another function that relies upon 'clauses' is allocation and order of execution. We need to assure that pixels and vertices are exported in the correct order (even if not all execution is ordered) and that space in the output buffers is allocated in order. Additionally, data can't be exported until space is allocated. A new control flow instruction:

Alloc <buffer select -- position,parameter, pixel or vertex memory. And the size required>.

would be created to mark where such allocation needs to be done. To assure allocation is done in order, the actual allocation for a given thread can not be performed unless the equivalent allocation for all previous threads is already completed. The implementation would also assure that execution of instruction(s) following the serialization due to the Alloc will occur in order, at least until the next serialization or change from ALU to Texture. In most cases this will allow the exports to occur without any further synchronization. Only 'final' allocations or position allocations are guaranteed to be ordered. Because strict ordering is required for pixels, parameters and positions, this implies only a single alloc for these structures. Vertex exports to memory do not require ordering during allocation and so multiple 'allocs' may be done.

#### 6.2.1 Control flow instructions table

Here is the revised control flow instruction set.

Note that whenever a field is marked as RESERVED, it is assumed that all the bits of the field are cleared (0).

| NOP       |            |          |
|-----------|------------|----------|
| 47 ... 44 | 43         | 42 ... 0 |
| 0000      | Addressing | RESERVED |

This is a regular NOP.

| Execute   |            |           |                                                |           |              |
|-----------|------------|-----------|------------------------------------------------|-----------|--------------|
| 47 ... 44 | 43         | 42 ... 34 | 33 ... 16                                      | 15 ... 12 | 11 ... 0     |
| 0001      | Addressing | RESERVED  | Instructions type + serialize (9 instructions) | Count     | Exec Address |

| Execute_End |            |           |                                                |           |              |
|-------------|------------|-----------|------------------------------------------------|-----------|--------------|
| 47 ... 44   | 43         | 42 ... 34 | 33 ... 16                                      | 15 ... 12 | 11 ... 0     |
| 0010        | Addressing | RESERVED  | Instructions type + serialize (9 instructions) | Count     | Exec Address |

Execute up to 9 instructions at the specified address in the instruction memory. For each instruction, the Instruction type field tells the sequencer the type of the instruction (LSB: 1 = Texture, 0 = ALU) and whether or not to serialize the execution (MSB: 1 = Serialize, 0 = Non-Serialized). If the instruction is Execute\_End, this is the last execution block of the shader program.

| 11 ... 0     |
|--------------|
| Exec Address |

| Conditional_Execute_End |            |           |                 |                                                |           |              |
|-------------------------|------------|-----------|-----------------|------------------------------------------------|-----------|--------------|
| 47 ... 44               | 43         | 42        | 41 ... 34       | 33 ... 16                                      | 15 ... 12 | 11 ... 0     |
| 0100                    | Addressing | Condition | Boolean address | Instructions type + serialize (9 instructions) | Count     | Exec Address |

If the specified Boolean (8 bits can address 256 Booleans) meets the specified condition then execute the specified instructions (up to 9 instructions). If the condition is not met, we go on to the next control flow instruction. If Conditional\_Execute\_End and the condition is met, this is the last execution block of the shader program.

| Conditional_Execute_Predicates |            |           |           |                  |                                                |           |              |
|--------------------------------|------------|-----------|-----------|------------------|------------------------------------------------|-----------|--------------|
| 47 ... 44                      | 43         | 42        | 41 ... 36 | 35 ... 34        | 33 ... 16                                      | 15 ... 12 | 11 ... 0     |
| 0101                           | Addressing | Condition | RESERVED  | Predicate vector | Instructions type + serialize (9 instructions) | Count     | Exec Address |

| Conditional_Execute_Predicates_End |            |           |           |                  |                                                |           |              |
|------------------------------------|------------|-----------|-----------|------------------|------------------------------------------------|-----------|--------------|
| 47 ... 44                          | 43         | 42        | 41 ... 36 | 35 ... 34        | 33 ... 16                                      | 15 ... 12 | 11 ... 0     |
| 0110                               | Addressing | Condition | RESERVED  | Predicate vector | Instructions type + serialize (9 instructions) | Count     | Exec Address |

Check the AND/OR of all current predicate bits. If the AND/OR matches the condition, execute the specified number of instructions. We need to AND/OR this with the kill mask in order not to consider the pixels that aren't valid. If the condition is not met, we go on to the next control flow instruction. If Conditional\_Execute\_Predicates\_End and the condition is met, this is the last execution block of the shader program.

| Conditional_Execute_Predicates_No_Stall |            |           |           |                  |                                                |           |              |
|-----------------------------------------|------------|-----------|-----------|------------------|------------------------------------------------|-----------|--------------|
| 47 ... 44                               | 43         | 42        | 41 ... 36 | 35 ... 34        | 33 ... 16                                      | 15 ... 12 | 11 ... 0     |
| 1101                                    | Addressing | Condition | RESERVED  | Predicate vector | Instructions type + serialize (9 instructions) | Count     | Exec Address |

| Conditional_Execute_Predicates_No_Stall_End |            |           |           |                  |                                                |           |              |
|---------------------------------------------|------------|-----------|-----------|------------------|------------------------------------------------|-----------|--------------|
| 47 ... 44                                   | 43         | 42        | 41 ... 36 | 35 ... 34        | 33 ... 16                                      | 15 ... 12 | 11 ... 0     |
| 1110                                        | Addressing | Condition | RESERVED  | Predicate vector | Instructions type + serialize (9 instructions) | Count     | Exec Address |

Same as Conditional\_Execute\_Predicates but the SQ is not going to wait for the predicate vector to be updated. You can only set this in the compiler if you know that the predicate set is only a refinement of the current one (like a nested if), because the optimization would still work.

| Loop_Start |            |           |           |           |              |
|------------|------------|-----------|-----------|-----------|--------------|
| 47 ... 44  | 43         | 42 ... 21 | 20 ... 16 | 15 ... 12 | 11 ... 0     |
| 0111       | Addressing | RESERVED  | loop ID   | RESERVED  | Jump address |

Loop Start. Compares the loop iterator with the end value. If the loop condition is not met, jump to the address. Forward jump only. Also computes the index value. The loop ID must match between the start and the end, and also indicates which control flow constants should be used with the loop.

| Loop_End  |            |           |                 |           |           |               |
|-----------|------------|-----------|-----------------|-----------|-----------|---------------|
| 47 ... 44 | 43         | 42 ... 24 | 23 ... 21       | 20 ... 16 | 15 ... 12 | 11 ... 0      |
| 1000      | Addressing | RESERVED  | Predicate break | loop ID   | RESERVED  | start address |

Loop end. Increments the counter by one, compares the loop count with the end value. If loop condition met, continue, else, jump BACK to the start of the loop. If predicate break != 0, then compares predicate vector n (specified by predicate break number). If all bits cleared then break the loop.

The way this is described does not prevent nested loops, and the inclusion of the loop ID makes this easy to do.

| Conditional_Call |            |           |                 |           |            |              |
|------------------|------------|-----------|-----------------|-----------|------------|--------------|
| 47 ... 44        | 43         | 42        | 41 ... 34       | 33 ... 13 | 12         | 11 ... 0     |
| 1001             | Addressing | Condition | Boolean address | RESERVED  | Force Call | Jump address |

If the condition is met, jumps to the specified address and pushes the control flow program counter on the stack. If force call is set the condition is ignored and the call is made always.

| Return    |            |          |
|-----------|------------|----------|
| 47 ... 44 | 43         | 42 ... 0 |
| 1010      | Addressing | RESERVED |

Pops the topmost address from the stack and jumps to that address. If nothing is on the stack, the program will just continue to the next instruction.
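The call and return behaviour can be sketched with a small Python model. The class and method names are hypothetical. Two points are assumptions not fixed by the text: that the address pushed is that of the instruction following the call, and that overflowing the 4-entry stack is an error.

```python
STACK_DEPTH = 4   # the controlling state provides a stack of 4 elements


class CallStack:
    """Toy model of Conditional_Call / Return on the control flow PC."""

    def __init__(self):
        self.stack = []

    def call(self, pc, target, condition_met, force_call=False):
        """Return the next control flow PC after a Conditional_Call at pc."""
        if force_call or condition_met:
            if len(self.stack) == STACK_DEPTH:
                raise OverflowError("call stack deeper than 4 entries")
            self.stack.append(pc + 1)   # assumed return address
            return target
        return pc + 1                   # condition not met: fall through

    def ret(self, pc):
        """Return the next control flow PC after a Return at pc."""
        if self.stack:
            return self.stack.pop()
        return pc + 1                   # empty stack: continue to the next
```

With this model, a Return executed outside of any subroutine simply falls through to the following control flow instruction, as the description requires.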

| Conditional_Jump |            |           |                 |         |           |            |              |
|------------------|------------|-----------|-----------------|---------|-----------|------------|--------------|
| 47 ... 44        | 43         | 42        | 41 ... 34       | 33      | 32 ... 13 | 12         | 11 ... 0     |
| 1011             | Addressing | Condition | Boolean address | FW only | RESERVED  | Force Jump | Jump address |

If force jump is set the condition is ignored and the jump is made always. If FW only is set then only forward jumps are allowed.




| Allocate  |       |               |          |                 |
|-----------|-------|---------------|----------|-----------------|
| 47 ... 44 |       |               |          | 3 ... 0         |
| 1100      | Debug | Buffer Select | RESERVED | Allocation size |

Buffer Select takes a value of the following:

01 - position export (ordered export)

10 - parameter cache or pixel export (ordered export)

11 - pass thru (out of order exports).

Buffer Size takes a value of the following:

00 - 1 buffer

01 - 2 buffers

...

15 - 16 buffers

If debug is set this is a debug alloc (ignore if debug DB\_ON register is set to off).

## 6.3 Implementation

The envisioned implementation has a buffer that maintains the state of each thread. A thread lives in a given location in the buffer during its entire life, but the buffer has FIFO qualities in that threads leave in the order that they enter. Actually two buffers are maintained -- one for Vertices and one for Pixels. The intended implementation would allow for:

16 entries for vertices

48 entries for pixels.

From each buffer, arbitration logic attempts to select 1 thread for the texture unit and 1 (interleaved) thread for the ALU unit. Once a thread is selected it is read out of the buffer, marked as invalid, and submitted to the appropriate execution unit. It is returned to the buffer (at the same place) with its status updated once all possible sequential instructions have been executed. A switch from ALU to TEX or vice versa, or a Serialize\_Execution modifier, forces the thread to be returned to the buffer.

Each entry in the buffer will be stored across two physical pieces of memory - most bits will be stored in a 1 read port device. Only bits needed for thread arbitration will be stored in a highly multi-ported structure. The bits kept in the 1 read port device will be termed 'state'. The bits kept in the multi-read ported device will be termed 'status'.

'State Bits' needed include:

- 1. Control Flow Instruction Pointer (13 bits),
- 2. Execution Count Marker (4 bits),
- 3. Loop Iterators (4x9 bits),
- 4. Call return pointers (4x12 bits),
- 5. Predicate Bits (64 bits),
- 6. Export ID (1 bit),
- 7. Parameter Cache base Ptr (7 bits),
- 8. GPR Base Ptr (8 bits),
- 9. Context Ptr (3 bits),
- 10. LOD corrections (6x16 bits),
- 11. Valid bits (64 bits).

Absent from this list are 'Index' pointers. These are costly enough that I'm presuming that they are instead stored in the GPRs. The first seven fields above (Control Flow Ptr, Execution Count, Loop Counts, call return ptrs, Predicate bits, PC base ptr and export ID) are updated every time the thread is returned to the buffer, based on how much progress has been made on thread execution. GPR Base Ptr, Context Ptr and LOD corrections are unchanged throughout execution of the thread.




'Status Bits' needed include:

- Valid Thread
- Texture/ALU engine needed
- Texture Reads are outstanding
- Waiting on Texture Read to Complete
- Allocation Wait (2 bits)
  - 00 - No allocation needed
  - 01 - Position export allocation needed (ordered export)
  - 10 - Parameter or pixel export needed (ordered export)
  - 11 - pass thru (out of order export)
- Allocation Size (4 bits)
- Position Allocated
- First thread of a new context
- Event thread (NULL thread that needs to trickle down the pipe)
- Last (1 bit)
- Pulse SX (1 bit)

All of the above fields from all of the entries go into the arbitration circuitry. The arbitration circuitry will select a winner for both the Texture Engine and for the ALU engine. There are actually two sets of arbitration -- one for pixels and one for vertices. A final selection is then done between the two. But the rest of this implementation summary only considers the 'first' level selection which is similar for both pixels and vertices.

Texture arbitration requires no allocation or ordering so it is purely based on selecting the 'oldest' thread that requires the Texture Engine.

ALU arbitration is a little more complicated. First, only threads where either of Texture\_Reads\_outstanding or Waiting\_on\_Texture\_Read\_to\_Complete is '0' are considered. Then, if Allocation\_Wait is active, these threads are further filtered based on whether space is available. If the allocation is a position allocation, then the thread is only considered if all 'older' threads have already done their position allocation (position allocated bits set). If the allocation is a parameter or pixel allocation, then the thread is only considered if it is the oldest thread. Also, a thread is not considered if it is a parameter or pixel or position allocation, has its First\_thread\_of\_a\_new\_context bit set and would cause ALU interleaving with another thread performing the same parameter or pixel or position allocation. Finally, the 'oldest' of the threads that pass through the above filters is selected. If the thread needed to allocate, then at this time the allocation is done, based on Allocation\_Size. If a thread has its "last" bit set, then it is also removed from the buffer, never to return.
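The filtering steps can be sketched in Python as below. This is a simplified model under stated assumptions, not the arbitration logic itself: the `Thread` fields, the `age` ordering (lower means older), and the `space_available` callback are hypothetical, the texture filter follows the literal wording (a thread is excluded only when both texture bits are set), and the First\_thread\_of\_a\_new\_context interleaving rule is left out.

```python
from dataclasses import dataclass

NO_ALLOC, POSITION, PARAM_OR_PIXEL, PASS_THRU = 0b00, 0b01, 0b10, 0b11


@dataclass
class Thread:
    age: int                          # hypothetical: lower value = older
    valid: bool = True
    needs_alu: bool = True            # Texture/ALU engine needed
    tex_reads_outstanding: bool = False
    waiting_on_tex: bool = False
    alloc_wait: int = NO_ALLOC        # Allocation Wait (2 bits)
    position_allocated: bool = False


def select_alu_thread(threads, space_available):
    """Return the oldest thread that passes every filter, or None."""
    live = sorted((t for t in threads if t.valid), key=lambda t: t.age)
    for i, t in enumerate(live):
        older = live[:i]
        if not t.needs_alu:
            continue
        if t.tex_reads_outstanding and t.waiting_on_tex:
            continue                  # neither texture bit is '0'
        if t.alloc_wait != NO_ALLOC:
            if not space_available(t):
                continue
            if t.alloc_wait == POSITION and not all(
                    o.position_allocated for o in older):
                continue              # an older thread has not allocated
            if t.alloc_wait == PARAM_OR_PIXEL and older:
                continue              # must be the oldest thread
        return t
    return None
```

In hardware all entries are evaluated in parallel; the loop over an age-sorted list is only a convenient way to express "oldest thread that passes the filters".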

If I now redefine 'clauses' to mean 'how many times the thread is removed from the thread buffer for the purpose of execution by either the ALU or Texture engine', then the minimum number of clauses needed is 2: one to perform the allocation for exports (execution automatically halts after an 'Alloc' instruction, but doesn't perform the actual allocation) and one for the actual ALU/export instructions. As the 'Alloc' instruction could be part of a texture clause (presumably the final instruction in such a clause), a thread could still execute in this minimal number of 2 clauses, even if it involved texture fetching.

The Texture\_Reads\_Outstanding bit must be updated by the sequencer, based on keeping track of how many Texture Clauses have been executed by a given thread that have not yet had their data returned. Any number above 0 results in this bit being set. We could consider forcing synchronization such that two texture clauses for a given thread may not be outstanding at any time (that would be my preference for simplicity reasons and because it would require only very little change in the texture pipe interface). This would allow the sequencer to set the bit on execution of the texture clause, and allow the texture unit to return a pointer to the thread buffer on completion that clears the bit.

# 6.4 Data dependent predicate instructions

Data dependent conditionals will be supported in the R400. The only way we plan to support those is by supporting four vector/scalar predicate operations of the form:




PRED\_SETE\_# - similar to SETE except that the result is 'exported' to the sequencer.

PRED\_SETNE\_# - similar to SETNE except that the result is 'exported' to the sequencer.

PRED\_SETGT\_# - similar to SETGT except that the result is 'exported' to the sequencer.

PRED\_SETGTE\_# - similar to SETGTE except that the result is 'exported' to the sequencer.

For the scalar operations only we will also support the two following instructions:

PRED\_SETE0\_# - SETE0

PRED\_SETE1\_# - SETE1

The export is a single bit (1 or 0) that is sent using the same data path as the MOVA instruction. The sequencer will maintain 4 sets of 64 bit predicate vectors (in fact 8 sets because we interleave two programs, but only 4 will be exposed) and use them to control the write masking. This predicate is not maintained across clause boundaries. The # sign specifies which predicate set to use, 0 thru 3.

Then we have two conditional execute bits. The first bit is a conditional execute "on" bit and the second bit tells us if we execute on 1 or 0. For example, the instruction:

P0\_ADD\_# R0,R1,R2

Is only going to write the result of the ADD into those GPRs whose predicate bit is 0. Alternatively, P1\_ADD\_# would only write the results to the GPRs whose predicate bit is set. The use of the P0 or P1 without precharging the sequencer with a PRED instruction is undefined.
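As a rough software model of the write masking described above (not the hardware data path), the predicate vector can be treated as an integer bit mask with one bit per element of the vector. The function name, the list-based GPR model and the bit ordering (bit i belongs to element i) are assumptions made for this sketch.

```python
def predicated_write(gprs, results, predicate, execute_on):
    """Model of a P0_/P1_ predicated write.

    gprs       -- current destination values, one per vector element
    results    -- ALU results, one per vector element
    predicate  -- integer whose bit i is the predicate bit of element i (assumed layout)
    execute_on -- 0 for a P0_ instruction, 1 for a P1_ instruction
    """
    return [
        r if ((predicate >> i) & 1) == execute_on else g
        for i, (g, r) in enumerate(zip(gprs, results))
    ]


# Four-element example: elements 0 and 2 have their predicate bit set.
old = [0, 0, 0, 0]
new = [1, 2, 3, 4]
p0 = predicated_write(old, new, 0b0101, execute_on=0)  # writes elements 1 and 3
p1 = predicated_write(old, new, 0b0101, execute_on=1)  # writes elements 0 and 2
```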

{Issue: do we have to have a NOP between PRED and the first instruction that uses a predicate?}

# 6.5 HW Detection of PV,PS

Because of the control program, the compiler cannot statically detect dependent instructions. In the case of non-masked writes and subsequent reads, the sequencer will insert uses of PV,PS as needed. This will be done by comparing the read address and the write address of consecutive instructions. For masked writes, the sequencer will insert NOPs wherever there is a dependent read/write.

The sequencer will also have to insert NOPs between PRED\_SET and MOVA instructions and their uses.

# 6.6 Register file indexing

Because we can have loops in a fetch clause, we need to be able to index into the register file in order to retrieve the data created in a fetch clause loop and use it in an ALU clause. The instruction will include the base address for register indexing, along with these controls:

| Bit 7 | Bit 6 | Addressing mode     |
|-------|-------|---------------------|
| 0     | 0     | 'absolute register' |
| 0     | 1     | 'relative register' |
| 1     | 0     | 'previous vector'   |
| 1     | 1     | 'previous scalar'   |

In the case of an absolute register we just take the address as is. In the case of a relative register read we take the base address and we add to it the loop\_index and this becomes our new address that we give to the shader pipe.
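The decode of the two control bits can be sketched as below. This is an illustrative model only: the function name, the tuple return value and the labels "gpr", "previous_vector" and "previous_scalar" are hypothetical, and the sketch assumes the address carried by the instruction doubles as the base address in relative mode.

```python
def resolve_operand(bit7, bit6, address, loop_index):
    """Return (source, register_address) for one operand.

    register_address is None when the operand comes from the previous
    vector or previous scalar result rather than from the register file.
    """
    if (bit7, bit6) == (0, 0):
        return ("gpr", address)                # absolute: take the address as is
    if (bit7, bit6) == (0, 1):
        return ("gpr", address + loop_index)   # relative: base + loop index
    if (bit7, bit6) == (1, 0):
        return ("previous_vector", None)
    return ("previous_scalar", None)
```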

The sequencer is going to keep a loop index computed as such:

Index = Loop\_iterator\*Loop\_step + Loop\_start.

We loop until loop\_iterator = loop\_count. Loop\_step is a signed value [-128...127]. The computed index value is a 10 bit counter that is also signed. Its real range is [-256,256]. The tenth bit is only there so that we can provide an out of range value to the "indexing logic" so that it knows when the provided index is out of range and thus can make the necessary arrangements.
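A small sketch of the index computation follows. It assumes the iterator counts up from 0 and stops when it reaches loop\_count, and it reports an index as out of range when it falls outside the [-256, 256] interval quoted above; how the indexing logic reacts to an out-of-range index is not modeled. The function name is hypothetical.

```python
def loop_indices(loop_count, loop_step, loop_start):
    """Yield (index, in_range) for each loop iteration."""
    if not -128 <= loop_step <= 127:
        raise ValueError("Loop_step is a signed value in [-128, 127]")
    for loop_iterator in range(loop_count):
        index = loop_iterator * loop_step + loop_start
        # Assumption: the 'real range' quoted in the text bounds valid indices.
        yield index, -256 <= index <= 256


# Stepping backwards by the largest negative step from a start of 0.
backwards = list(loop_indices(3, -128, 0))
```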




# 6.7 Debugging the Shaders

In order to be able to debug the pixel/vertex shaders efficiently, we provide 2 methods.

# 6.7.1 Method 1: Debugging registers

Current plans are to expose 2 debugging, or error notification, registers:

1. address register where the first error occurred
2. count of the number of errors

The sequencer will detect the following groups of errors:

- count overflow
- constant indexing overflow
- register indexing overflow

Compiler recognizable errors:

- jump errors
  - relative jump address > size of the control flow program
- call stack
  - call with stack full
  - return with stack empty

A jump error will always cause the program to break. In this case, a break means that the clause halts execution, but further clauses are still allowed to execute.

With all the other errors, the program can continue to run, potentially to worst-case limits. The program will only break if the DB\_PROB\_BREAK register is set.

When an index falls outside the constant or register range and causes an overflow error, the hardware is specified to return the value at index 0. This could be exploited to generate error tokens, by reserving and initializing the 0th register (or constant) for errors.

{ISSUE : Interrupt to the driver or not?}

#### 6.7.2 Method 2: Exporting the values in the GPRs

The sequencer will have a debug active register, a count register and an address register for this mode.

Under the normal mode execution follows the normal course.

Under the debug mode it is assumed that the program is always exporting n debug vectors and that all other exports to the SX block (position, color, z, etc.) will be turned off (changed into NOPs) by the sequencer (even if they occur before the address stated by the ADDR debug register).

## 7. Pixel Kill Mask

A vector of 64 bits is kept by the sequencer per group of pixels/vertices. Its purpose is to optimize the texture fetch requests and allow the shader pipe to kill pixels using the following instructions:

MASK\_SETE, MASK\_SETNE, MASK\_SETGT, MASK\_SETGTE

## 8. Multipass vertex shaders (HOS)

Multipass vertex shaders are able to export from the last 6 clauses, but to memory ONLY.




# 9. Register file allocation

The register file allocation for vertices and pixels can either be static or dynamic. In both cases, the register file is managed using two round robins (one for pixels and one for vertices). In the dynamic case the boundary between pixels and vertices is allowed to move; in the static case it is fixed to 128-VERTEX\_REG\_SIZE for vertices and PIXEL\_REG\_SIZE for pixels.



Above is an example of how the algorithm works. Vertices come in from top to bottom; pixels come in from bottom to top. Vertices are in orange and pixels in green. The blue line is the tail of the vertices and the green line is the tail of the pixels. Thus anything between the two lines is shared. When pixels meet vertices, the line turns white and the boundary is static until both vertices and pixels share the same "unallocated bubble". Then the boundary is allowed to move again. The numbering of the GPRs starts from the bottom of the picture at index 0 and goes up to the top at index 127.

## 10. Fetch Arbitration

The fetch arbitration logic chooses one of the n potentially pending fetch clauses to be executed. The choice is made by looking at the Vs and Ps reservation stations and picking the first one ready to execute. Once chosen, the clause state machine will send one 2x2 fetch per clock (or 4 fetches in one clock every 4 clocks) until all the fetch instructions of the clause are sent. This means that there cannot be any dependencies between two fetches of the same clause.

The arbitrator will not wait for the fetches to return prior to selecting another clause for execution. The fetch pipe will be able to handle up to X(?) in flight fetches and thus there can be a fair number of active clauses waiting for their fetch return data.

## 11. ALU Arbitration

ALU arbitration proceeds in almost the same way as fetch arbitration. The ALU arbitration logic chooses one of the n potentially pending ALU clauses to be executed. The choice is made by looking at the Vs and Ps reservation stations and picking the first one ready to execute. There are two ALU arbiters, one for the even clocks and one for the odd clocks. For example, here is the sequencing of two interleaved ALU clauses (E and O stand for Even and Odd sets of 4 clocks):

Einst0 Oinst0 Einst1 Oinst1 Einst2 Oinst2 Einst0 Oinst3 Einst1 Oinst4 Einst2 Oinst0...




Proceeding this way hides the 8-clock latency of the ALUs. Note that the interleaving also occurs across clause boundaries.

# 12. Handling Stalls

When the output file is full, the sequencer prevents the ALU arbitration logic from selecting the last clause (this way nothing can exit the shader pipe until there is room in the output file). If the packet is a vertex packet and the position buffer is full (POS\_FULL), the sequencer also prevents a thread from entering an exporting clause (3?). The sequencer will set the OUT\_FILE\_FULL signal n clocks before the output file is actually full, so the ALU arbiter will be able to read this signal and act accordingly by preventing exporting clauses from proceeding.

#### 13. Content of the reservation station FIFOs

The reservation FIFOs contain the state of the vector of pixels and vertices. We have two sets of those: one for pixels, and one for vertices. They contain 3 bits of Render State, 7 bits for the base address of the GPRs, some bits for LOD correction and coverage mask information (in order to fetch for only valid pixels), and the quad address.

#### 14. The Output File

The output file is where pixels are put before they go to the RBs. The write BW to this store is 256 bits/clock. Just before this output file are staging registers with write BW 512 bits/clock and read BW 256 bits/clock. The staging registers are 4x128 (and there are 16 of those on the whole chip).

#### 15. IJ Format

The IJ information sent by the PA is of this format on a per quad basis:

We have a vector of IJ's (one IJ per pixel at the centroid of the fragment or at the center of the pixel depending on the mode bit). The interpolation is done at a different precision across the 2x2. The upper left pixel's parameters are always interpolated at full 20x24 mantissa precision. Then the result of the interpolation along with the difference in IJ in reduced precision is used to interpolate the parameter for the other three pixels of the 2x2. Here is how we do it:

Assuming P0 is the interpolated parameter at Pixel 0 having the barycentric coordinates I(0), J(0) and so on for P1, P2 and P3. Also assuming that A is the parameter value at V0 (interpolated with I), B is the parameter value at V1 (interpolated with J) and C is the parameter value at V2 (interpolated with (1-I-J)).

$$P0 = A + I(0) * (B - A) + J(0) * (C - A)$$

$$P1 = A + I(1) * (B - A) + J(1) * (C - A)$$

$$P2 = A + I(2) * (B - A) + J(2) * (C - A)$$

$$P3 = A + I(3) * (B - A) + J(3) * (C - A)$$



P0 is computed at 20x24 mantissa precision and P1 to P3 are computed at 8x24 mantissa precision. So far no visual degradation of the image was seen using this scheme.

- Multiplies (Full Precision): 2
- Multiplies (Reduced precision): 6
- Subtracts 19x24 (Parameters): 2
- Adds: 8
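The four equations above can be evaluated with two shared subtracts followed by two multiplies and two adds per pixel, which is where a total of 8 multiplies and 8 adds per quad comes from. The sketch below shows only that arithmetic in ordinary Python floats; it does not model the 20x24 versus 8x24 mantissa precisions or the delta encoding, and the function name is hypothetical.

```python
def interpolate_quad(a, b, c, ij):
    """Interpolate one parameter for the four pixels of a 2x2.

    a, b, c -- parameter values A, B, C from the equations above
    ij      -- four (I, J) pairs, one per pixel
    """
    d_ba = b - a  # first shared subtract
    d_ca = c - a  # second shared subtract
    # Two multiplies and two adds per pixel.
    return [a + i * d_ba + j * d_ca for (i, j) in ij]


quad = interpolate_quad(1.0, 3.0, 5.0,
                        [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.5, 0.5)])
```

With A = B = C the two differences are exactly zero, so every pixel receives A regardless of its I and J values.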

.....

FORMAT OF P0's IJ : Mantissa 20 Exp 4 for I + Sign




Mantissa 20 Exp 4 for J + Sign

```
FORMAT of Deltas (x3):Mantissa 8 Exp 4 for I + Sign
Mantissa 8 Exp 4 for J + Sign
```

```
Total number of bits : 20*2 + 8*6 + 4*8 + 4*2 = 128.
```

All numbers are kept using the un-normalized floating point convention: if the exponent is different from 0 the number is normalized; if not, the number is un-normalized. The maximum range for the IJs (full precision) is +/- 63 and the range for the deltas is +/- 127.

#### 15.1 Interpolation of constant attributes

Because of floating point imprecision, we need to take special provisions if all the interpolated terms are the same or if two of the barycentric coordinates are the same.

We start with the premise that if A = B and B = C and C = A, then P0,1,2,3 = A. Since one or more of the IJ terms may be zero, we extend this to:

#### 16. Staging Registers

In order for the reuse of the vertices to be 14, the sequencer will have to re-order the data sent IN ORDER by the VGT for it to be aligned with the parameter cache memory arrangement. Given the following group of vertices sent by the VGT:

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 || 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 || 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 || 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63

The sequencer will re-arrange them in this fashion:

0 1 2 3 16 17 18 19 32 33 34 35 48 49 50 51 || 4 5 6 7 20 21 22 23 36 37 38 39 52 53 54 55 || 8 9 10 11 24 25 26 27 40 41 42 43 56 57 58 59 || 12 13 14 15 28 29 30 31 44 45 46 47 60 61 62 63

The || markers show the SP divisions. In the event a shader pipe is broken, the VGT will send padding to account for the missing pipe. For example, if SP1 is broken, vertices 4 5 6 7 20 21 22 23 36 37 38 39 52 53 54 55 will still be sent by the VGT to the SQ BUT will not be processed by the SP and thus should be considered invalid (by the SU and VGT).
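The re-arrangement shown above amounts to taking, for each shader pipe in turn, one run of four vertices from each incoming group of 16. A minimal sketch, assuming 64 vertices, 4 shader pipes and runs of 4 as in the example (the function name is hypothetical):

```python
def reorder_for_shader_pipes(vertices):
    """Re-order 64 in-order vertices into four shader pipe groups of 16."""
    if len(vertices) != 64:
        raise ValueError("expected a full vector of 64 vertices")
    out = []
    for sp in range(4):          # destination shader pipe
        for group in range(4):   # incoming group of 16 from the VGT
            start = 16 * group + 4 * sp
            out.extend(vertices[start:start + 4])
    return out


reordered = reorder_for_shader_pipes(list(range(64)))
sp1 = reordered[16:32]  # the vertices that would be padding if SP1 were broken
```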


The most straightforward, non-compressed interface method would be to convert, in the VGT, the data to 32-bit floating point prior to transmission to the VSISRs. In this scenario, the data would be transmitted to (and stored in) the VSISRs in full 32-bit floating point. This method requires three 24-bit fixed-to-float converters in the VGT. Unfortunately, it also requires an additional 3,072 bits of storage across the VSISRs. This interface is illustrated in Figure 11. The area of the fixed-to-float converters and the VSISRs for this method is roughly estimated as 0.759sqmm using the R300 process. The gate count estimate is shown in Figure 10.



Figure 10: Area Estimate for VGT to Shader Interface



Figure 11: VGT to Shader Interface




# 17. The parameter cache

The parameter cache is where the vertex shaders export their data. It consists of 16 128x128 memories (1R/1W). The reuse engine will make it so that all vertexes of a given primitive will hit different memories. The allocation method for these memories is a simple round robin. The parameter cache pointers are mapped in the following way: the 4 MSBs are the memory number and the 7 LSBs are the address within this memory.

| MEMORY NUMBER | ADDRESS |
|---------------|---------|
| 4 bits        | 7 bits  |

The PA generates the parameter cache addresses as the positions come from the SQ. All it needs to do is keep a Current\_Location pointer (7 bits only) and increment the memory number as the positions come in. When the memory number field wraps around, the PA increments Current\_Location by VS\_EXPORT\_COUNT (a snooped register from the SQ). As an example, say the memories are all empty to begin with and the vertex shader is exporting 8 parameters per vertex (VS\_EXPORT\_COUNT = 8). The first position received is going to have the PC address 00000000000, the second one 00010000000, the third one 00100000000 and so on up to 11110000000. Then the next position received (the 17<sup>th</sup>) is going to have the address 00000001000, the 18<sup>th</sup> 00010001000, the 19<sup>th</sup> 00100001000 and so on. Current\_Location is NEVER reset except on chip reset. The only thing to be careful about is that if the SX doesn't send a full group of positions (<64), the address space must be filled so that the next group starts correctly aligned (for example, if only 33 positions are received, 2\*VS\_EXPORT\_COUNT must be added to Current\_Location and the memory count reset to 0 before the next vector begins).
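The address generation just described can be modeled with two small counters. The sketch below is illustrative only: the class and method names are hypothetical, and the general padding rule in `end_of_group` is an assumption extrapolated from the single 33-position example in the text (a full group of 64 positions corresponds to four wraps of the 16 memories).

```python
class ParamCacheAddressGen:
    """Software model of the PA's parameter cache address generation."""

    def __init__(self, vs_export_count):
        self.vs_export_count = vs_export_count
        self.current_location = 0  # 7-bit address within a memory
        self.memory_number = 0     # 4-bit memory select

    def next_address(self):
        """Return the 11-bit address for the next position received."""
        addr = (self.memory_number << 7) | self.current_location
        self.memory_number = (self.memory_number + 1) & 0xF
        if self.memory_number == 0:
            # The memory number wrapped: move to the next block of parameters.
            self.current_location = (
                self.current_location + self.vs_export_count) & 0x7F
        return addr

    def end_of_group(self, positions_received):
        """Pad a short group so the next group starts aligned (assumed rule)."""
        if positions_received < 64:
            missing_wraps = 4 - positions_received // 16
            self.current_location = (
                self.current_location
                + missing_wraps * self.vs_export_count) & 0x7F
            self.memory_number = 0


gen = ParamCacheAddressGen(vs_export_count=8)
addresses = [format(gen.next_address(), "011b") for _ in range(19)]
```

With VS\_EXPORT\_COUNT = 8, `addresses[0]`, `addresses[1]` and `addresses[16]` correspond to the first, second and 17th positions of the example above.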

# 17.1 Export restrictions

#### 17.1.1 Pixel exports:

Pixels can export 1, 2, 3 or 4 color buffers to the SX (+z). The exports will be done in order. The PRED\_OPTIMIZE function has to be turned off if the exports are done using interleaved predicated instructions. The exports will always be ordered to the SX.

#### 17.1.2 Vertex exports:

Position or parameter caches can be exported in any order in the shader program. It is always better to export position as soon as possible. Position has to be exported in a single export block (no texture instructions can be placed between the exports). Parameter cache exports can be done in any order with texture instructions interleaved. The PRED\_OPTIMIZE function has to be turned off if the exports are done using interleaved predicated instructions to the Parameter cache (see Arbitration restrictions for details). The exports will always be allocated in order to the SX.

#### 17.1.3 Pass thru exports:

Pass thru exports have to be done in groups of the form:

```
Alloc 4 (8 or 12)
Execute ALU(ADDR) ALU(DATA) ALU(DATA) ALU(DATA)...
```

They cannot have texture instructions interleaved in the export block. These exports are not guaranteed to be ordered.

Also, when doing a pass thru export, Position MUST be exported AFTER all pass thru exports. This position export is used to synchronize the chip when doing a transition from pass thru shader to regular shader and vice versa.

#### 17.2 Arbitration restrictions

Here are the Sequencer arbitration restrictions:

- 1) Cannot execute a serialized thread if the corresponding texture pending bit is set
- 2) Cannot allocate position if any older thread has not allocated position
- 3) If last thread is marked as not valid AND marked as last and we are about to execute the second to oldest thread also marked last then:




  - a. Both threads must be from the same context (cannot allow a first thread)
  - b. Must turn off the predicate optimization for the second thread
- 4) Cannot execute a texture clause if texture reads are pending
- 5) Cannot execute last if texture pending (even if not serial)

## 18. Export Types

The export type (or the location where the data should be put) is specified using the destination address field in the ALU instruction. Here is a list of all possible export modes:

## 18.1 Vertex Shading

0:15 - 16 parameter cache

16:31 - Empty (Reserved?)

32 - Export Address

33:40 - 8 vertex exports to the frame buffer and index

41:47 - Empty

48:55 - 8 debug export (interpret as normal vertex export)

60 - export addressing mode

61 - Empty

62 - position

63 - sprite size export that goes with position export

(point\_h,point\_w,edgeflag,misc)

# 18.2 Pixel Shading

0 - Color for buffer 0 (primary)

1 - Color for buffer 1

2 - Color for buffer 2

3 - Color for buffer 3

4:7 - Empty

8 - Buffer 0 Color/Fog (primary)

9 - Buffer 1 Color/Fog

10 - Buffer 2 Color/Fog

11 - Buffer 3 Color/Fog

12:15 - Empty

16:31 - Empty (Reserved?)

32 - Export Address

33:40 - 8 exports for multipass pixel shaders.

41:47 - Empty

48:55 - 8 debug exports (interpret as normal pixel export)

60 - export addressing mode

61:62 - Empty

63 - Z for primary buffer (Z exported to 'alpha' component)

# 19. Special Interpolation modes

#### 19.1 Real time commands

We are unable to use the parameter memory since there is no way for a command stream to write into it. Instead we need to add three 16x128 memories (one for each of three vertices x 16 interpolants). These will be mapped onto the register bus and written by type 0 packets, and output to the parameter busses (the sequencer and/or PA need to be able to address the realtime parameter memory as well as the regular parameter memory). For higher performance we should be able to view them as two banks of 16 and do double buffering, allowing one to be loaded while the other is rasterized with. Most overlay shaders will need 2 or 4 scalar coordinates; one option might be to restrict the memory to 16x64 or 32x64, allowing only two interpolated scalars per cycle. The only problem I see with this is that, if we view support for 16 vector-4 interpolants as important (true only if we map Microsoft's high priority stream to the realtime stream), then the PA/sequencer need to support a realtime-specific mode where we need to address 32 vectors of parameters instead of 16. This mode is triggered by the primitive type: REAL TIME. The actual memories are in the SX blocks. The parameter data memories are hooked on the RBBM bus and are loaded by the CP using register mapped memory.

# 19.2 Sprites/ XY screen coordinates/ FB information

When working with sprites, one may want to overwrite the parameter 0 with SC generated data. Also, XY screen coordinates may be needed in the shader program. This functionality is controlled by the Param\_Gen\_I0 register (in SQ) in conjunction with the SND\_XY register (in SC). It is also possible to send the faceness information (for OGL front/back special operations) to the shader using the same control register. Here is a list of all the modes and how they interact together:

Gen\_st is a bit taken from the interface between the SC and the SQ. This is the MSB of the primitive type. If the bit is set, it means we are dealing with Point AA, Line AA or sprite, and in this case the vertex values are going to be generated between 0 and 1.

Param\_Gen\_I0 disable, snd\_xy disable, no gen\_st - I0 = No modification

Param\_Gen\_I0 disable, snd\_xy disable, gen\_st - I0 = No modification

Param\_Gen\_I0 disable, snd\_xy enable, no gen\_st - I0 = No modification

Param\_Gen\_I0 disable, snd\_xy enable, gen\_st - I0 = No modification

Param\_Gen\_I0 enable, snd\_xy disable, no gen\_st - I0 = garbage, garbage, garbage, faceness

Param\_Gen\_I0 enable, snd\_xy disable, gen\_st - I0 = garbage, garbage, s, t

Param\_Gen\_I0 enable, snd\_xy enable, no gen\_st - I0 = screen x, screen y, garbage, faceness

Param\_Gen\_I0 enable, snd\_xy enable, gen\_st - I0 = screen x, screen y, s, t
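The eight combinations listed above reduce to two independent choices once Param\_Gen\_I0 is enabled: snd\_xy selects the first two channels and gen\_st selects the last two. A compact sketch of that lookup (the function name and the string labels are purely illustrative):

```python
def i0_contents(param_gen_i0, snd_xy, gen_st):
    """Return the four channels written to I0, or None for 'No modification'."""
    if not param_gen_i0:
        return None
    first = ("screen x", "screen y") if snd_xy else ("garbage", "garbage")
    last = ("s", "t") if gen_st else ("garbage", "faceness")
    return first + last
```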

#### 19.3 Auto generated counters

When dealing with multipass shaders, the sequencer is going to generate a vector count so that it can use this count both to write the 1<sup>st</sup> pass data to memory and to retrieve the data on the 2<sup>nd</sup> pass. The count is always generated in the same way but it is passed to the shader in a slightly different way depending on the shader type (pixel or vertex). This is toggled on and off using the GEN\_INDEX register. The sequencer is going to keep two counters, one for pixels and one for vertices. Every time a full vector of vertices or pixels is written to the GPRs the counter is incremented. Every time a state change is detected, the corresponding counter is reset. While there is only one count broadcast to the GPRs, the LSBs are hardwired to specific values, making the index different for all elements in the vector.

#### 19.3.1 Vertex shaders

In the case of vertex shaders, if GEN\_INDEX is set, the data will be put into the x field of the third register (it means that the compiler must allocate 3 GPRs in all multipass vertex shader modes).

#### 19.3.2 Pixel shaders

In the case of pixel shaders, if GEN\_INDEX is set and Param\_Gen\_I0 is enabled, the data will be put in the x field of the 2<sup>nd</sup> register (R1.x), else if GEN\_INDEX is set the data will be put into the x field of the 1<sup>st</sup> register (R0.x).
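The placement rules for the generated count in sections 19.3.1 and 19.3.2 can be summarized by a small helper. This is only a restatement of the text in code form; the function name and the string shader types are hypothetical.

```python
def gen_index_register(shader_type, gen_index, param_gen_i0=False):
    """Return the GPR number whose x field receives the count, or None."""
    if not gen_index:
        return None
    if shader_type == "vertex":
        return 2  # third register
    if shader_type == "pixel":
        return 1 if param_gen_i0 else 0  # R1.x when I0 is generated, else R0.x
    raise ValueError("shader_type must be 'vertex' or 'pixel'")
```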





Figure 12: GPR input mux Control

# 20. State management

Every clock, the sequencer will report to the CP the oldest states still in the pipe. These are the states of the programs as they enter the last ALU clause.

#### 20.1 Parameter cache synchronization

In order for the sequencer not to begin a group of pixels before the associated group of vertices has finished, the sequencer will keep a 6 bit count per state (for a total of 8 counters). These counters are initialized to 0 and every time a vertex shader exports its data TO THE PARAMETER CACHE, the corresponding pointer is incremented. When the SC sends a new vector of pixels with the SC\_SQ\_new\_vector bit asserted, the sequencer will first check if the count is greater than 0 before accepting the transmission (it will in fact accept the transmission but then lower its ready to receive). Then the sequencer waits for the count to go to one and decrements it. The sequencer can then issue the group of pixels to the interpolators. Every time the state changes, the new state counter is initialized to 0.

#### 21. XY Address imports

The SC will be able to send the XY addresses to the GPRs. It does so by interleaving the writes of the IJs (to the IJ buffer) with XY writes (to the XY buffer). Then, when writing the data to the GPRs, the sequencer is going to interpolate the IJ data or pass the XY data thru a Fix→float converter and expander and write the converted values to the GPRs. The XYs are currently SCREEN SPACE COORDINATES. The values in the XY buffers will wrap. See section 19.2 for details on how to control the interpolation in this mode.

#### 21.1 Vertex indexes imports

In order to import vertex indexes, we have 16 8x96 staging registers. These are loaded one line at a time by the VGT block (96 bits). They are loaded in floating point format and can be transferred in 4 or 8 clocks to the GPRs.

## 22. Registers

Please see the auto-generated web pages for register definitions.

### 22.1 Control

| Register           | Description                                                                                                                        |
|--------------------|------------------------------------------------------------------------------------------------------------------------------------|
| REG_DYNAMIC        | Dynamic allocation (pixel/vertex) of the register file on or off.                                                                  |
| REG_SIZE_PIX       | Size of the register file's pixel portion (minimal size when dynamic allocation turned on)                                         |
| REG_SIZE_VTX       | Size of the register file's vertex portion (minimal size when dynamic allocation turned on)                                        |
| ARBITRATION_POLICY | policy of the arbitration between vertexes and pixels                                                                              |
| INST_BASE_VTX      | start point for the vertex instruction store (RT always ends at vertex_base and begins at 0)                                       |
| INST_BASE_PIX      | start point for the pixel shader instruction store                                                                                 |
| ONE_THREAD         | debug state register. Only allows one program at a time into the GPRs                                                              |
| ONE_ALU            | debug state register. Only allows one ALU program at a time to be executed (instead of 2)                                          |
| INSTRUCTION        | This is where the CP puts the base address of the instruction writes and type (auto-incremented on reads/writes). Register mapped  |
| CONSTANTS          | 512\*4 ALU constants + 32\*6 Texture state 32 bits registers (logically mapped)                                                    |
| CONSTANTS_RT       | 256\*4 ALU constants + 32\*6 texture states? (physically mapped)                                                                   |
| CONSTANT_EO_RT     | This is the size of the space reserved for real time in the constant store (from 0 to CONSTANT_EO_RT). The re-mapping table operates on the rest of the memory |
| TSTATE_EO_RT       | This is the size of the space reserved for real time in the fetch state store (from 0 to TSTATE_EO_RT). The re-mapping table operates on the rest of the memory |

### 22.2 Context

CF\_LOOP\_STEP

| DO DAGE                 | hanna majukan farakkan miyat akandan in khan imakayakian mkana                                   |
|-------------------------|--------------------------------------------------------------------------------------------------|
| PS_BASE                 | base pointer for the pixel shader in the instruction store                                       |
| VS_BASE                 | base pointer for the vertex shader in the instruction store                                      |
| VS_CF_SIZE              | size of the vertex shader (# of instructions in control program/2)                               |
| PS_CF_SIZE              | size of the pixel shader (# of instructions in control program/2)                                |
| PS_SIZE                 | size of the pixel shader (cntl+instructions)                                                     |
| VS_SIZE                 | size of the vertex shader (cntl+instructions)                                                    |
| PS_NUM_REG              | number of GPRs to allocate for pixel shader programs                                             |
| VS_NUM_REG              | number of GPRs to allocate for vertex shader programs                                            |
| PARAM_SHADE             | One 16 bit register specifying which parameters are to be gouraud shaded (0 = flat, 1 = gouraud) |
| PARAM_WRAP              | 64 bits: for which parameters (and channels (xyzw)) do we do the cyl wrapping                    |
|                         | (0=linear, 1=cylindrical).                                                                       |
| PS_EXPORT_MODE          | Oxxxx : Normal mode                                                                              |
|                         | 1xxxx: Multipass mode                                                                            |
|                         | If normal, bbbz where bbb is how many colors (0-4) and z is export z or not                      |
|                         | If multipass 1-12 exports for color.                                                             |
| VS_EXPORT_MODE          | 0: position (1 vector), 1: position (2 vectors), 3:multipass                                     |
| VS_EXPORT_COUNT         | Number of locations exported by the VS (and thus number of interpolated                          |
| parameters)             |                                                                                                  |
| PARAM GEN IO            | Do we overwrite or not the parameter 0 with XY data and generated T and S values                 |
| GEN_INDEX               | Auto generates an address from 0 to XX. Puts the results into R0-1 for pixel shaders             |
|                         | and R2 for vertex shaders                                                                        |
| CONST_BASE_VTX (9 bits) | Logical base address for the constants of the vertex shader                                      |
|                         | Logical base address for the constants of the pixel shader                                       |
| CONST_SIZE_PIX (8 bits) | Size of the logical constant store for pixel shaders                                             |
| CONST_SIZE_VTX (8 bits) | Size of the logical constant store for vertex shaders                                            |
|                         | Turns on the predicate bit optimization (if off, conditional execute predicates is always executed). |
| CF_BOOLEANS             | 256 boolean bits                                                                                 |
| CF_LOOP_COUNT           | 32x8 bit counters (number of times we traverse the loop)                                         |
| CF_LOOP_START           | 32x8 bit counters (init value used in index computation)                                         |


32x8 bit counters (step value used in index computation)


### 23. DEBUG Registers

#### 23.1 Context

DB\_PROB\_ADDR instruction address where the first problem occurred

DB\_PROB\_COUNT number of problems encountered during the execution of the program

DB\_PROB\_BREAK break the clause if an error is found.

DB\_ON turns on and off debug method 2

DB\_INST\_COUNT instruction counter for debug method 2

DB\_BREAK\_ADDR break address for method number 2

#### 23.2 Control

DB\_ALUCST\_MEMSIZE Size of the physical ALU constant memory

DB\_TSTATE\_MEMSIZE Size of the physical texture state memory

### 24. Interfaces

#### 24.1 External Interfaces

Whenever an x is used, it means that the bus is broadcast to all units of the same name. For example, if a bus is named SQ→SPx it means that SQ is going to broadcast the same information to all SP instances.

### 24.2 SC to SP Interfaces

### 24.2.1 SC\_SP#

There is one of these interfaces at the front of each SP (buffer to stage pixel interpolators). This interface transmits the I,J data for pixel interpolation. For the entire system, two quads per clock are transferred to the 4 SPs, so each of these 4 interfaces transmits one half of a quad per clock. The interface below describes a half of a quad worth of data.

The actual data which is transferred per quad is:

- Ref Pix I => S4.20 Floating Point I value
- Ref Pix J => S4.20 Floating Point J value
- Delta Pix I (x3) => S4.8 Floating Point Delta I value
- Delta Pix J (x3) => S4.8 Floating Point Delta J value

This equates to a total of 128 bits, which are transferred over 2 clocks and therefore need an interface 64 bits wide.
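
The bit budget for the reference-plus-delta encoding can be checked with a short calculation. This is only a sketch: it assumes that S4.20 and S4.8 denote one sign bit, a 4-bit exponent, and a 20-bit or 8-bit mantissa (the reading suggested by the SE4M20 and SE4M8 format names in the interface table). The function and variable names are illustrative and do not come from the specification.

```python
def float_bits(exp_bits: int, mant_bits: int) -> int:
    """Width of a sign + exponent + mantissa value (assumed S<exp>.<mant> layout)."""
    return 1 + exp_bits + mant_bits


REF_BITS = float_bits(4, 20)    # S4.20 reference value
DELTA_BITS = float_bits(4, 8)   # S4.8 delta value

# One reference value plus three deltas for I, and the same again for J.
bits_per_component = REF_BITS + 3 * DELTA_BITS
bits_per_transfer = 2 * bits_per_component

# The data is sent over two clocks, so the bus carries half of it per clock.
bus_width = bits_per_transfer // 2

print(REF_BITS, DELTA_BITS, bits_per_component, bits_per_transfer, bus_width)
```

Under these assumptions one component (I or J) occupies exactly one clock on the bus, which is consistent with I being sent on the first clock and J on the second.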

Additionally, X,Y data (12-bit unsigned fixed) is conditionally sent across this data bus over the same wires in an additional clock. The X,Y data is sent on the lower 24 bits of the data bus with faceness in the msb.

Transfers across these interfaces are synchronized with the SC\_SQ IJ Control Bus transfers.

The data transfer across each of these busses is controlled by an IJ\_BUF\_INUSE\_COUNT in the SC. Each time the SC has sent a pixel vector's worth of data to the SPs, he will increment the IJ\_BUF\_INUSE\_COUNT count. Prior to sending the next pixel vector's data, he will check to make sure the count is less than MAX\_BUFER\_MINUS\_2; if not, the SC will stall until the SQ returns a pipelined pulse to decrement the count when he has scheduled a buffer free. Note: We could/may optimize for the case of sending only IJ to use all the buffers to pre-load more.
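
The counter-based flow control described above can be modelled in a few lines. This is a behavioural sketch, not the hardware implementation: the class name, method names, and the limit value used in the example are hypothetical, and the free pulse is assumed to be the SQ\_SC\_free\_buff bit listed with the SQ to SC signals.

```python
class IJBufferCounter:
    """Hypothetical model of the SC-side IJ_BUF_INUSE_COUNT flow control."""

    def __init__(self, max_bufer_minus_2: int):
        # MAX_BUFER_MINUS_2: the real value depends on the buffer configuration.
        self.limit = max_bufer_minus_2
        self.in_use = 0  # IJ_BUF_INUSE_COUNT

    def can_send(self) -> bool:
        # The SC may send only while the count is below the limit.
        return self.in_use < self.limit

    def send_vector(self) -> bool:
        """Try to send one pixel vector; return False if the SC must stall."""
        if not self.can_send():
            return False
        self.in_use += 1
        return True

    def free_buffer(self) -> None:
        """Pulse from the SQ: a buffer has been scheduled free."""
        if self.in_use > 0:
            self.in_use -= 1


# Example with an arbitrary limit of 2: the third send stalls until a free pulse.
counter = IJBufferCounter(max_bufer_minus_2=2)
attempts = [counter.send_vector() for _ in range(3)]
counter.free_buffer()
resumed = counter.send_vector()
print(attempts, resumed)
```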

Currently it is planned for the SP to hold 2 double buffers of I,J data and two buffers of X,Y data, so if either X,Y or Centers and Centroids are on, then the SC can send two Buffers.

In at least the initial version, the SC shall send 16 quads per pixel vector even if the vector is not full. This will increment buffer write address pointers correctly all the time. (We may revisit this for the SX, SP, and SQ and add an EndOfVector signal on all interfaces to quit early. We opted for the simple mode first with a belief that only the end of packet and multiple new vector signals should cause a partial vector and that this would not really be a significant performance hit.)

| Name                  | Bits | Description                                                                                        |
|-----------------------|------|----------------------------------------------------------------------------------------------------|
| SC_SP#_data           | 64   | IJ information sent over 2 clocks (or X,Y in 24 LSBs with faceness in upper bit)                   |
|                       |      | Type 0 or 1: first clock I, second clock J                                                         |
|                       |      | Field: ULC, URC, LLC, LRC                                                                          |
|                       |      | Bits: [63:39], [38:26], [25:13], [12:0]                                                            |
|                       |      | Format: SE4M20, SE4M8, SE4M8, SE4M8                                                                |
|                       |      | Type 2:                                                                                            |
|                       |      | Field: Face, X, Y                                                                                  |
|                       |      | Bits: [63], [23:12], [11:0]                                                                        |
|                       |      | Format: Bit, Unsigned, Unsigned                                                                    |
| SC_SP#_valid          | 1    | Valid                                                                                              |
| SC_SP#_last_quad_data | 1    | This bit will be set on the last transfer of data per quad.                                        |
| SC_SP#_type           | 2    | 0 -> Indicates centroids                                                                           |
|                       |      | 1 -> Indicates centers                                                                             |
|                       |      | 2 -> Indicates X,Y Data and faceness on data bus                                                   |
|                       |      | The SC shall look at state data to determine how many types to send for the interpolation process. |

The # is included for clarity in the spec and will be replaced with a prefix of u#\_ in the Verilog module statement for the SC. The SP block will have neither, because the instantiation will insert the prefix.
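
As an illustration of the SC\_SP#\_data layout, the following sketch splits a 64-bit data word into the fields named in the table, using the bit ranges [63:39], [38:26], [25:13], [12:0] for types 0 and 1 and [63], [23:12], [11:0] for type 2. The function names are hypothetical, and the fields are returned as raw bit patterns; decoding the floating-point formats is outside the scope of this sketch.

```python
def bits(word: int, hi: int, lo: int) -> int:
    """Return word[hi:lo] (inclusive), in the style of a Verilog part-select."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def unpack_sc_sp_data(word: int, xfer_type: int) -> dict:
    """Split one SC_SP#_data word according to SC_SP#_type (hypothetical helper)."""
    if xfer_type in (0, 1):  # centroids or centers: I on one clock, J on the next
        return {
            "ULC": bits(word, 63, 39),
            "URC": bits(word, 38, 26),
            "LLC": bits(word, 25, 13),
            "LRC": bits(word, 12, 0),
        }
    if xfer_type == 2:  # X,Y data in the 24 LSBs, faceness in the MSB
        return {
            "face": bits(word, 63, 63),
            "x": bits(word, 23, 12),
            "y": bits(word, 11, 0),
        }
    raise ValueError("SC_SP#_type value 3 is not defined in the table")


# Example: a type 2 word with faceness set, X = 0xABC and Y = 0x123.
xy_word = (1 << 63) | (0xABC << 12) | 0x123
xy_fields = unpack_sc_sp_data(xy_word, 2)
print(xy_fields)
```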

### 24.2.2 SC\_SQ

This is the control information sent to the sequencer in order to synchronize and control the interpolation and/or loading data into the GPRs needed to execute a shader program on the sent pixels. This data will be sent over two clocks per transfer with 1 to 16 transfers. Therefore the bus (approx 94 bits) could be folded in half to approx 49 bits.

| Name        | Bits | Description |
|-------------|------|-------------|
| SC_SQ_data  | 46   | Control Data sent to the SQ |
|             |      | 1 clk transfers |
|             |      | Event – valid data consist of event_id and state_id. Instruct SQ to post an event vector to send state_id and event_id through request fifo and onto the reservation stations making sure state_id and/or event_id gets back to the CP. Events only follow end of packets so no pixel vectors will be in progress. |
|             |      | Empty Quad Mask – Transfer Control data consisting of pc_dealloc or new_vector. Receipt of this is to transfer pc_dealloc or new_vector without any valid quad data. New vector will always be posted to request fifo and pc_dealloc will be attached to any pixel vector outstanding or posted in request fifo if no valid quad outstanding. |
|             |      | 2 clk transfers |
|             |      | Quad Data Valid – Sending quad data with or without new_vector or pc_dealloc. New vector will be posted to request fifo with or without a pixel vector and pc_dealloc will be posted with a pixel vector unless none is in progress. In this case the pc_dealloc will be posted in the request queue. Filler quads will be transferred with the quad mask set but the corresponding pixel mask set to zero. |
| SC_SQ_valid | 1    | SC sending valid data, 2nd clk could be all zeroes |

SC\_SQ\_data - first clock and second clock transfers are shown in the table below.

| Name                           | BitField | Bits | Description                                                                 |
|--------------------------------|----------|------|-----------------------------------------------------------------------------|
|                                |          |      |                                                                             |
| 1 <sup>st</sup> Clock Transfer |          |      |                                                                             |
| SC_SQ_event                    | 0        | 1    | This transfer is a 1 clock event vector. Force quad_mask =                  |
|                                |          |      | new_vector = pc_dealloc = 0                                                 |
| SC_SQ_event_id                 | [4:1]    | 4    | This field identifies the event. 0 => denotes an End Of State Event, 1 => TBD |
| SC_SQ_pc_dealloc               | [7:5]    | 3    | Deallocation token for the Parameter Cache                                  |
| SC_SQ_new_vector               | 8        | 1    | The SQ must wait for Vertex shader done count > 0 and after                 |
|                                |          |      | dispatching the Pixel Vector the SQ will decrement the count.               |
| SC_SQ_quad_mask                | [12:9]   | 4    | Quad Write mask left to right SP0 => SP3                                    |
| SC_SQ_end_of_prim              | 13       | 1    | End Of the primitive                                                        |
| SC_SQ_state_id                 | [16:14]  | 3    | State/constant pointer (6*3+3)                                              |
| SC_SQ_pix_mask                 | [32:17]  | 16   | Valid bits for all pixels SP0=>SP3 (UL,UR,LL,LR)                            |
| SC_SQ_provok_vtx               | [37:36]  | 2    | Provoking vertex for flat shading                                           |
| SC_SQ_pc_ptr0                  | [48:38]  | 11   | Parameter Cache pointer for vertex 0                                        |
|                                |          |      |                                                                             |
| 2nd Clock Transfer             |          |      |                                                                             |
| SC_SQ_pc_ptr1                  | [10:0]   | 11   | Parameter Cache pointer for vertex 1                                        |
| SC_SQ_pc_ptr2                  | [21:11]  | 11   | Parameter Cache pointer for vertex 2                                        |
| SC_SQ_lod_correct              | [45:22]  | 24   | LOD correction per quad (6 bits per quad)                                   |
| SC_SQ_prim_type                | [48:46]  | 3    | Stippled line and Real time command need to load tex cords from             |
|                                |          |      | alternate buffer                                                            |
|                                |          |      | 000: Sprite (point)                                                         |
|                                |          |      | 001: Line                                                                   |
|                                |          |      | 010: Tri_rect                                                               |
|                                |          |      | 100: Realtime Sprite (point)                                                |
|                                |          |      | 101: Realtime Line                                                          |
|                                |          |      | 110: Realtime Tri_rect                                                      |
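
The bit fields of the two SC\_SQ\_data clock transfers can be captured as lookup tables and decoded generically. This is a minimal sketch: the dictionary names and the `decode` helper are hypothetical, and only the bit ranges are taken from the table. Bits [35:33] of the first transfer are not assigned in the table and are simply ignored here.

```python
# Field name -> (msb, lsb), as listed for the 1st clock transfer.
FIRST_CLOCK_FIELDS = {
    "event": (0, 0),
    "event_id": (4, 1),
    "pc_dealloc": (7, 5),
    "new_vector": (8, 8),
    "quad_mask": (12, 9),
    "end_of_prim": (13, 13),
    "state_id": (16, 14),
    "pix_mask": (32, 17),
    "provok_vtx": (37, 36),
    "pc_ptr0": (48, 38),
}

# Field name -> (msb, lsb), as listed for the 2nd clock transfer.
SECOND_CLOCK_FIELDS = {
    "pc_ptr1": (10, 0),
    "pc_ptr2": (21, 11),
    "lod_correct": (45, 22),
    "prim_type": (48, 46),
}


def decode(word: int, fields: dict) -> dict:
    """Extract every (msb, lsb) field from a transfer word."""
    return {
        name: (word >> lsb) & ((1 << (msb - lsb + 1)) - 1)
        for name, (msb, lsb) in fields.items()
    }


# Example: new_vector set, quads 1 and 3 written, pc_ptr0 = 0x7FF.
first = decode((0x7FF << 38) | (0b1010 << 9) | (1 << 8), FIRST_CLOCK_FIELDS)
second = decode((0b101 << 46) | (0x123 << 11), SECOND_CLOCK_FIELDS)
print(first["pc_ptr0"], first["quad_mask"], second["prim_type"])
```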

| Name               | Bits | Description                                                                   |
|--------------------|------|-------------------------------------------------------------------------------|
| SQ_SC_free_buff    | 1    | Pipelined bit that instructs SC to decrement count of buffers in use.         |
| SQ_SC_dec_cntr_cnt | 1    | Pipelined bit that instructs SC to decrement count of new vector and/or event |
|                    |      | sent to prevent SC from overflowing SQ interpolator/Reservation request fifo. |

The scan converter will submit a partial vector whenever:

- 1.) He gets a primitive marked with an end of packet signal.
- 2.) A current pixel vector is being assembled with at least one valid quad and the vector has been marked for deallocate when a primitive marked new\_vector arrives. The Scan Converter will submit a partial vector (up to 16 quads with zero pixel mask to fill out the vector) prior to submitting the new\_vector marker/primitive.

(This will prevent a hang which can be demonstrated when all primitives in a packet of three vectors are culled except for a one quad primitive that gets marked pc\_dealloc (vertices maximum size). In this case two new\_vectors are submitted and processed, but then one valid quad with the pc\_dealloc creates a vector and then the new\_vector would wait for another vertex vector to be processed, but the one being waited for could never export until the pc\_dealloc signal made it through, and thus the hang.)
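
The two conditions under which the scan converter submits a partial vector can be written as a small predicate. This is a sketch under stated assumptions: the function and argument names are hypothetical, the vector size of 16 quads is taken from the SC\_SP interface description, and the case of an end of packet with no valid quads assembled is not special-cased because the text does not describe it.

```python
QUADS_PER_VECTOR = 16  # the SC sends 16 quads per pixel vector


def partial_vector_fill(valid_quads, marked_dealloc, end_of_packet, new_vector_arrives):
    """Return the number of zero-pixel-mask filler quads to send, or None if
    neither rule calls for a partial vector (hypothetical helper)."""
    # Rule 1: a primitive marked with an end of packet signal.
    rule1 = end_of_packet
    # Rule 2: a vector with at least one valid quad, already marked for
    # deallocate, when a primitive marked new_vector arrives.
    rule2 = valid_quads >= 1 and marked_dealloc and new_vector_arrives
    if rule1 or rule2:
        return QUADS_PER_VECTOR - valid_quads
    return None


# The hang scenario: one valid quad marked pc_dealloc, then a new_vector arrives.
print(partial_vector_fill(1, True, False, True))
```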

### 24.2.3 SQ to SX(SP): Interpolator bus

| Name                        | Direction | Bits | Description                                   |
|-----------------------------|-----------|------|-----------------------------------------------|
| SQ_SPx_interp_flat_vtx      | SQ→SPx    | 2    | Provoking vertex for flat shading             |
| SQ_SPx_interp_flat_gouraud  | SQ→SPx    | 1    | Flat or gouraud shading                       |
| SQ_SPx_interp_cyl_wrap      | SQ→SPx    | 4    | Which channel needs to be cylindrical wrapped |
| SQ_SXx_pc_ptr0              | SQ→SXx    | 11   | Parameter Cache Pointer                       |
| SQ_SXx_pc_ptr1              | SQ→SXx    | 11   | Parameter Cache Pointer                       |
| SQ_SXx_pc_ptr2              | SQ→SXx    | 11   | Parameter Cache Pointer                       |
| SQ_SXx_rt_sel               | SQ→SXx    | 1    | Selects between RT and Normal data            |
| SQ_SXx_pc_wr_en             | SQ→SXx    | 1    | Write enable for the PC memories              |
| SQ_SXx_pc_wr_addr           | SQ→SXx    | 7    | Write address for the PCs                     |
| SQ_SXx_pc_channel_mask      | SQ→SXx    | 4    | Channel mask                                  |
| SQ_SXx_pc_ptr_valid         | SQ→SXx    | 1    | Read pointers are valid.                      |
| SQ_SPx_interp_valid         | SQ→SPx    | 1    | Interpolation control valid                   |

#### 24.2.4 SQ to SP: Staging Register Data

This is a broadcast bus that sends the VSISR information to the staging registers of the shader pipes.

| Name               | Direction | Bits | Description                                            |
|--------------------|-----------|------|--------------------------------------------------------|
| SQ_SPx_vsr_data    | SQ→SPx    | 96   | Pointers of indexes or HOS surface information         |
| SQ_SPx_vsr_double  | SQ→SPx    | 1    | 0: Normal 96 bits per vert 1: double 192 bits per vert |
| SQ_SP0_vsr_valid   | SQ→SP0    | 1    | Data is valid                                          |
| SQ_SP1_vsr_valid   | SQ→SP1    | 1    | Data is valid                                          |
| SQ_SP2_vsr_valid   | SQ→SP2    | 1    | Data is valid                                          |
| SQ_SP3_vsr_valid   | SQ→SP3    | 1    | Data is valid                                          |
| SQ_SPx_vsr_read    | SQ→SPx    | 1    | Increment the read pointers                            |

### 24.2.5 VGT to SQ: Vertex interface

### 24.2.5.1 Interface Signal Table

The area difference between the two methods is not sufficient to warrant complicating the interface or the state requirements of the VSISRs. Therefore, the POR for this interface is that the VGT will transmit the data to the VSISRs (via the Shader Sequencer) in full, 32-bit floating-point format. The VGT can transmit up to six 32-bit floating-point values to each VSISR where four or more values require two transmission clocks. The data bus is 96 bits wide.
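
The relationship between the number of 32-bit values and the number of transmission clocks on the 96-bit bus can be expressed directly. The function name below is illustrative only; the widths and the limit of six values come from the paragraph above.

```python
import math

BUS_BITS = 96     # width of the VGT to SQ data bus
VALUE_BITS = 32   # each value is a full 32-bit floating-point number
MAX_VALUES = 6    # the VGT can transmit up to six values per VSISR


def vsisr_clocks(num_values: int) -> int:
    """Number of transmission clocks needed for one VSISR (hypothetical helper)."""
    if not 1 <= num_values <= MAX_VALUES:
        raise ValueError("a VSISR carries between 1 and 6 values")
    return math.ceil(num_values * VALUE_BITS / BUS_BITS)


clocks = [vsisr_clocks(n) for n in range(1, MAX_VALUES + 1)]
print(clocks)
```

Three values fill the bus exactly, so four or more values spill into a second clock, matching the statement that four or more values require two transmission clocks.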

| Name                 | Bits | Description                                                                                                                           |
|----------------------|------|---------------------------------------------------------------------------------------------------------------------------------------|
| VGT_SQ_vsisr_data    | 96   | Pointers of indexes or HOS surface information                                                                                        |
| VGT_SQ_event         | 1    | VGT is sending an event                                                                                                               |
| VGT_SQ_vsisr_double  | 1    | 0: Normal 96 bits per vert 1: double 192 bits per vert                                                                                |
| VGT_SQ_end_of_vector | 1    | Indicates the last VSISR data set for the current process vector (for double vector data, "end_of_vector" is set on the first vector) |
| VGT_SQ_indx_valid    | 1    | Vsisr data is valid                                                                                                                   |
| VGT_SQ_state         | 3    | Render State (6*3+3 for constants). This signal is guaranteed to be correct when "VGT_SQ_vgt_end_of_vector" is high.                  |
| VGT_SQ_send          | 1    | Data on the VGT_SQ is valid (see write-up for standard R400 SEND/RTR interface handshaking)                                           |
| SQ_VGT_rtr           | 1    | Ready to receive (see write-up for standard R400 SEND/RTR interface handshaking)                                                      |

### 24.2.5.2 Interface Diagrams




Figure 1. Detailed Logical Diagram for PA SQ vgt Interface.


#### 24.2.6 SQ to SX: Control bus

| Name               | Direction | Bits | Description                                                                                                               |
|--------------------|-----------|------|---------------------------------------------------------------------------------------------------------------------------|
| SQ_SXx_exp_type    | SQ→SXx    |      | 00: Pixel without z (1 to 4 buffers)<br>01: Pixel with z (1 to 4 buffers)<br>10: Position (1 or 2 results)<br>11: Pass thru (4, 8 or 12 results aligned) |
| SQ_SXx_exp_number  | SQ→SXx    | 2    | Number of locations needed in the export buffer (encoding depends on the type, see below).                               |
| SQ_SXx_exp_alu_id  | SQ→SXx    |      | ALU ID                                                                                                                    |
| SQ_SXx_exp_valid   | SQ→SXx    | 1    | Valid bit                                                                                                                 |
| SQ_SXx_exp_state   | SQ→SXx    | 3    | State Context                                                                                                             |
| SQ_SXx_free_done   | SQ→SXx    |      | Pulse to indicate that the previous export is finished (this can be sent with or without the other fields of the interface) |
| SQ_SXx_free_alu_id | SQ→SXx    |      | ALU ID                                                                                                                    |

Depending on the type, the number of export locations changes:

- Type 00: Pixels without Z
  - 00 = 1 buffer
  - 01 = 2 buffers
  - 10 = 3 buffers
  - 11 = 4 buffers
- Type 01: Pixels with Z
  - 00 = 2 buffers (color + Z)
  - 01 = 3 buffers (2 color + Z)
  - 10 = 4 buffers (3 color + Z)
  - 11 = 5 buffers (4 color + Z)
- Type 10: Position export
  - 00 = 1 position
  - 01 = 2 positions
  - 1X = Undefined
- Type 11: Pass Thru
  - 00 = 4 buffers
  - 01 = 8 buffers
  - 10 = 12 buffers
  - 11 = Undefined

Below the thick black line (the SQ\_SXx\_free\_done and SQ\_SXx\_free\_alu\_id fields) is the end of transfer packet that tells the SX that a given export is finished. The report packet will always arrive either before or at the same time as the next export to the same ALU id.
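
The type-dependent encoding of SQ\_SXx\_exp\_number listed above can be sketched as a lookup table. The table and function names are hypothetical; the counts are copied from the list, with `None` standing in for the encodings marked Undefined.

```python
# exp_type -> export location count for exp_number 00, 01, 10, 11.
EXPORT_LOCATIONS = {
    0b00: (1, 2, 3, 4),        # pixels without Z
    0b01: (2, 3, 4, 5),        # pixels with Z (colors + Z)
    0b10: (1, 2, None, None),  # position export, 1X is undefined
    0b11: (4, 8, 12, None),    # pass thru, 11 is undefined
}


def export_locations(exp_type: int, exp_number: int) -> int:
    """Number of export buffer locations for a 2-bit type and 2-bit number."""
    if exp_type not in EXPORT_LOCATIONS or not 0 <= exp_number <= 3:
        raise ValueError("exp_type and exp_number are 2-bit fields")
    count = EXPORT_LOCATIONS[exp_type][exp_number]
    if count is None:
        raise ValueError(
            f"undefined encoding: type={exp_type:02b} number={exp_number:02b}"
        )
    return count


print(export_locations(0b01, 0b11), export_locations(0b11, 0b10))
```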

#### 24.2.7 SX to SQ: Output file control

| Name                 | Direction | Bits | Description                                                                                          |
|----------------------|-----------|------|------------------------------------------------------------------------------------------------------|
| SXx_SQ_exp_count_rdy | SXx→SQ    | 1    | Raised by SX0 to indicate that the following two fields reflect the result of the most recent export |
| SXx_SQ_exp_pos_avail | SXx→SQ    | 1    | Specifies whether there is room for another position.                                                |
| SXx_SQ_exp_buf_avail | SXx→SQ    | 7    | Specifies the space available in the output buffers.<br>0: buffers are full<br>1: 2K-bits available (32-bits for each of the 64 pixels in a clause)<br>64: 128K-bits available (16 128-bit entries for each of 64 pixels)<br>65-127: RESERVED |


#### 24.2.8 SQ to TP: Control bus

Once every clock, the fetch unit tells the sequencer which RS line it is currently working on and whether the data in the GPRs is ready. This way the sequencer can update the fetch valid bit flags for the reservation station. The sequencer also provides the instruction and constants for the fetch to execute, and the address in the register file where the fetch return data is to be written.

| Name                   | Direction | Bits | Description                                               |
|------------------------|-----------|------|-----------------------------------------------------------|
| TPx_SQ_data_rdy        | TPx→SQ    | 1    | Data ready                                                |
| TPx_SQ_rs_line_num     | TPx→SQ    | 6    | Line number in the Reservation station                    |
| TPx_SQ_type            | TPx→SQ    | 1    | Type of data sent (0:PIXEL, 1:VERTEX)                     |
| SQ_TPx_send            | SQ→TPx    | 1    | Sending valid data                                        |
| SQ_TPx_const           | SQ→TPx    | 48   | Fetch state sent over 4 clocks (192 bits total)           |
| SQ_TPx_instr           | SQ→TPx    | 24   | Fetch instruction sent over 4 clocks                      |
| SQ_TPx_end_of_group    | SQ→TPx    | 1    | Last instruction of the group                             |
| SQ_TPx_Type            | SQ→TPx    | 1    | Type of data sent (0:PIXEL, 1:VERTEX)                     |
| SQ_TPx_gpr_phase       | SQ→TPx    | 2    | Write phase signal                                        |
| SQ_TP0_lod_correct     | SQ→TP0    | 6    | LOD correct 3 bits per comp 2 components per quad         |
| SQ_TP0_pix_mask        | SQ→TP0    | 4    | Pixel mask 1 bit per pixel                                |
| SQ_TP1_lod_correct     | SQ→TP1    | 6    | LOD correct 3 bits per comp 2 components per quad         |
| SQ_TP1_pix_mask        | SQ→TP1    | 4    | Pixel mask 1 bit per pixel                                |
| SQ_TP2_lod_correct     | SQ→TP2    | 6    | LOD correct 3 bits per comp 2 components per quad         |
| SQ_TP2_pix_mask        | SQ→TP2    | 4    | Pixel mask 1 bit per pixel                                |
| SQ_TP3_lod_correct     | SQ→TP3    | 6    | LOD correct 3 bits per comp 2 components per quad         |
| SQ_TP3_pix_mask        | SQ→TP3    | 4    | Pixel mask 1 bit per pixel                                |
| SQ_TPx_rs_line_num     | SQ→TPx    | 6    | Line number in the Reservation station                    |
| SQ_TPx_write_gpr_index | SQ→TPx    | 7    | Index into Register file for write of returned Fetch Data |

### 24.2.9 TP to SQ: Texture stall

The TP sends this signal to the SQ and the SPs when its input buffer is full.



| Name              | Direction | Bits | Description                                   |
|-------------------|-----------|------|-----------------------------------------------|
| TP_SQ_fetch_stall | TP→SQ     | 1    | Do not send more texture requests if asserted |


### 24.2.10 SQ to SP: Texture stall

| Name               | Direction | Bits | Description                                   |
|--------------------|-----------|------|-----------------------------------------------|
| SQ_SPx_fetch_stall | SQ→SPx    | 1    | Do not send more texture requests if asserted |

### 24.2.11 SQ to SP: GPR and auto counter

| Name                 | Direction | Bits | Description                                                                                                                      |
|----------------------|-----------|------|----------------------------------------------------------------------------------------------------------------------------------|
| SQ_SPx_gpr_wr_addr   | SQ→SPx    | 7    | Write address                                                                                                                    |
| SQ_SPx_gpr_rd_addr   | SQ→SPx    | 7    | Read address                                                                                                                     |
| SQ_SPx_gpr_rd_en     | SQ→SPx    | 1    | Read Enable                                                                                                                      |
| SQ_SP0_gpr_wr_en     | SQ→SP0    | 1    | Write Enable for the GPRs of SP0                                                                                                 |
| SQ_SP1_gpr_wr_en     | SQ→SP1    | 1    | Write Enable for the GPRs of SP1                                                                                                 |
| SQ_SP2_gpr_wr_en     | SQ→SP2    | 1    | Write Enable for the GPRs of SP2                                                                                                 |
| SQ_SP3_gpr_wr_en     | SQ→SP3    | 1    | Write Enable for the GPRs of SP3                                                                                                 |
| SQ_SPx_gpr_phase     | SQ→SPx    | 2    | The phase mux (arbitrates between inputs, ALU SRC reads and writes)                                                              |
| SQ_SPx_channel_mask  | SQ→SPx    | 4    | The channel mask                                                                                                                 |
| SQ_SPx_gpr_input_sel | SQ→SPx    | 2    | When the phase mux selects the inputs, this indicates which source to read from: Interpolated data, VTX0, VTX1, autogen counter. |
| SQ_SPx_auto_count    | SQ→SPx    | 12?  | Auto count generated by the SQ, common for all shader pipes                                                                      |
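
The signal widths in the table above can be captured in a small software model, which is handy when writing a testbench or trace checker. The sketch below is only an illustration: the numeric encoding of the input select (taken here to follow the order in which the sources are listed), the mapping of mask bit *i* to channel *i*, and all Python names are assumptions. The auto count is left out because its width is still marked as uncertain in the table.

```python
from enum import IntEnum


class GprInputSel(IntEnum):
    # ASSUMPTION: values follow the order listed in the table description;
    # the specification does not give the numeric encoding.
    INTERPOLATED = 0
    VTX0 = 1
    VTX1 = 2
    AUTOGEN_COUNTER = 3


# Bit widths copied from the table above (auto count omitted: width uncertain).
WIDTHS = {
    "gpr_wr_addr": 7,
    "gpr_rd_addr": 7,
    "gpr_rd_en": 1,
    "gpr_phase": 2,
    "channel_mask": 4,
    "gpr_input_sel": 2,
}


def check_widths(signals: dict) -> None:
    """Raise ValueError if any signal value does not fit its declared width."""
    for name, value in signals.items():
        width = WIDTHS[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")


def enabled_channels(channel_mask: int) -> list:
    """Return the channel indices (0..3) whose bit is set in a 4-bit mask.

    ASSUMPTION: bit i of the mask corresponds to channel i.
    """
    return [i for i in range(4) if channel_mask & (1 << i)]
```

A 7-bit address allows values 0 through 127, so `check_widths({"gpr_wr_addr": 128})` raises, while `check_widths({"gpr_wr_addr": 127, "channel_mask": 0b1010})` passes silently.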


### 24.2.12 SQ to SPx: Instructions


| Name                  | Direction     | Bits         | Description                                                                                                                                                     |
|-----------------------|---------------|--------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------|
| SQ_SPx_instr_start    | SQ→SPx        | 1            | Instruction start                                                                                                                                               |
| SQ_SPx_instr          | SQ→SPx        | 22           | Transferred over 4 cycles. 0: SRC A Select                                                                                                                      |
|                       |               |              | 1: SRC B Select 2:0, SRC B Argument Modifier 3:3, SRC B swizzle 11:4, ScalarDst 17:12, Per channel use mask (PV/Reg) 21:18                                      |
|                       |               |              | 2: SRC C Select 2:0, SRC C Argument Modifier 3:3, SRC C swizzle 11:4, Per channel use mask (PV/Reg) 21:18                                                       |
|                       |               |              | 3: Vector Opcode 4:0, Scalar Opcode 10:5, Vector Clamp 11:11, Scalar Clamp 12:12, Vector Write Mask 16:13, Scalar Write Mask 20:17                              |
| SQ_SPx_exp_alu_id     | SQ→SPx        | 1            | ALU ID                                                                                                                                                          |
| SQ_SPx_exporting      | SQ→SPx        | 2            | 0: Not Exporting 1: Vector Exporting 2: Scalar Exporting                                                                                                        |
| SQ_SPx_stall          | SQ→SPx        | 1            | Stall signal                                                                                                                                                    |
| SQ_SP0_write_mask     | SQ→SP0        | 4            | Result of pixel kill in the shader pipe, which must be output for all pixel exports (depth and all color buffers). 4x4 because 16 pixels are computed per clock |
| SQ_SP1_write_mask     | SQ→SP1        | 4            | Result of pixel kill in the shader pipe, which must be output for all pixel exports (depth and all color buffers). 4x4 because 16 pixels are computed per clock |
| SQ_SP2_write_mask     | SQ→SP2        | 4            | Result of pixel kill in the shader pipe, which must be output for all pixel exports (depth and all color buffers). 4x4 because 16 pixels are computed per clock |
| SQ_SP3_write_mask     | SQ→SP3        | 4            | Result of pixel kill in the shader pipe, which must be output for all pixel exports (depth and all color buffers). 4x4 because 16 pixels are computed per clock |
| SQ_SPx_last           | SQ→SPx        | 1            | Last instruction of the block                                                                                                                                   |
| SQ_SP0_pred_overwrite | SQ→SP0        | 4            | Indicates to overwrite the use of PV/PS because of the predication (use the GPRs instead). This operation is done on a per-pixel basis.                         |
| SQ_SP1_pred_overwrite | SQ→SP1        | 4            | Indicates to overwrite the use of PV/PS because of                                                                                                              |
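
The instruction row in the table above gives each field as a `high:low` bit range within the word transferred on a given cycle. The following Python sketch shows one way to pack and unpack such a word, using the cycle 3 layout (opcodes, clamps, and write masks) exactly as listed. The helper names and the snake_case field names are hypothetical, and the sketch makes no claim about what any opcode value means.

```python
def extract(word: int, hi: int, lo: int) -> int:
    """Return bits hi..lo (inclusive) of word as an unsigned integer."""
    width = hi - lo + 1
    return (word >> lo) & ((1 << width) - 1)


# Cycle 3 field layout copied from the table: name -> (high bit, low bit).
# The Python field names are illustrative, not taken from the specification.
CYCLE3_FIELDS = {
    "vector_opcode": (4, 0),
    "scalar_opcode": (10, 5),
    "vector_clamp": (11, 11),
    "scalar_clamp": (12, 12),
    "vector_write_mask": (16, 13),
    "scalar_write_mask": (20, 17),
}


def decode(word: int, fields: dict) -> dict:
    """Split a transferred word into its named fields."""
    return {name: extract(word, hi, lo) for name, (hi, lo) in fields.items()}


def encode(values: dict, fields: dict) -> int:
    """Pack named field values into a word, checking each value fits its range."""
    word = 0
    for name, (hi, lo) in fields.items():
        width = hi - lo + 1
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in bits {hi}:{lo}")
        word |= value << lo
    return word
```

Because the fields do not overlap, `decode(encode(values, CYCLE3_FIELDS), CYCLE3_FIELDS)` returns the original `values` dictionary. The same helpers work for the other cycles once a layout dictionary for them is written down.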

                                                                                                  | er, 2001                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
                                                                                                                                               |                                                                                                                                                                              | nber, 2015                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
                                                                                                                                                                                                                                                                                                                                                                             |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
                                                                                                                                                                                                                                                                                                                                                                                                                              | GEN-CXXXXX-REVA                                                                                                                                                                                                                                                                 | 49 of 54                                |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| | | | the predication (use the GPRs instead). This [...]ion is done on a per-pixel basis. |
| SQ_SP2_pred_overwrite | SQ→SP2 | 4 | Indicates to overwrite the use of PV/PS
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
|                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
                                                 |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
                                                                                                  |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
                                                                                                                                               |                                                                                                                                                                              |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
                                                                                                                                                                                                                                                                                                                                                                             | operat                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           
                                                                                                                                                                                                                                                                                                                                                                                                                              | redication (use the GPRs instion is done on a per-pixel basis.                                                                                                                                                                                                                  |                                         |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
|                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
                                                 | red_overwrite                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                  | SQ→:                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
                                                                                                                                               |                                                                                                                                                                              | 4                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
                                                                                                                                                                                                                                                                                                                                                                             | the p<br>operat                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
                                                                                                                                                                                                                                                                                                                                                                                                                              | tes to overwrite the use of PV/PS redication (use the GPRs institution is done on a per-pixel basis.                                                                                                                                                                            |                                         | Formatted: Bullets and Numbering                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |

.2.13 SP to

                                                                                                                                               |                                                                                                                                                                              |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
                                                                                                                                                                                                                                                                                                                                                                             |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
                                                                                                                                                                                                                                                                                                                                                                                                                              | Predicate Set                                                                                                                                                                                                                                                                   | 4-                                      | To the state of th |
| Name | Direction | Description |
| SP0_SQ_co | SP0→SQ | address load / predicate vector load (4 bits only) to the sequencer |
| SP0_SQ_va | | SP0→SQ | | | Data valid | | | |
| SP1_SQ_co | | SP1→SQ | | | address load / predicate vector load (4 bits only) to the sequencer | | | |
| SP1_SQ_va | | SP1→SQ | | | Data valid
                                                                                                                                                                                                                                                                                                                                                                                                                              |                                                                                                                                                                                                                                                                                 | /4 laika   1 3                          |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| SP2_SQ_co | SP2→SQ | address load / predicate vector load (4 bits only) to the sequencer |
| SP2_SQ_va | SP2→SQ | Data valid |
| SP3_SQ_co |
                                                                                                  | SP3→S(                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
                                                                                                                                               |                                                                                                                                                                              | to                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
                                                                                                                                                                                                                                                                                                                                                                             | o the se                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
                                                                                                                                                                                                                                                                                                                                                                                                                              | : address load / predicate vector load<br>quencer                                                                                                                                                                                                                               | (4 bits only)                           |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| SP3_SQ_va | | SP3→SQ | | | Data Valid | | | |
| SP0_SQ_data_type | | SP→SQ | | | Data Type<br>0: Constant Load<br>1: Predicate… | | |
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
|                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
                                                 | .2.14_SQ to                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
                                                                                                  |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
                                                                                                                                               |                                                                                                                                                                              |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
                                                                                                                                                                                                                                                                                                                                                                             |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
| | |
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| 1.2.1423.  Name  SQ_SPx_co                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
                                                 |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
                                                                                                  | O SPx: C                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
                                                                                                                                               | n                                                                                                                                                                            | Bits C                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
                                                                                                                                                                                                                                                                                                                                                                             | Descript                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
| ion<br>broadcast | |
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| Name<br>SQ_SPx_co                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
                                                 |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
                                                                                                  | Directio<br>SQ→SP                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           
                                                                                                                                               | n<br>x                                                                                                                                                                       | Bits   D                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
                                                                                                                                                                                                                                                                                                                                                                             | Descript                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
                                                                                                                                                                                                                                                                                                                                                                                                                              |                                                                                                                                                                                                                                                                                 | 4-                                      | Formatted: Bullets and Numbering  Formatted: Bullets and Numbering                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |

| Name         | Direction | Bits | Description        |
|--------------|-----------|------|--------------------|
| SQ_SPx_const | SQ→SPx    | 128  | Constant broadcast |

#### 1.2.1523.2.15 SP0 to SQ: Kill vector


| Name | Direction | Bits | Description |
|------|-----------|------|-------------|
| SQ_SPx_co… | SQ→SPx | 128 | Constant broadcast |

…2.15 SP0 to SQ: Kill vector load

| Name | Direction | Bits | Description |
|------|-----------|------|-------------|
| SP0_SQ_ki… | SP0→SQ | 4 | Kill vector load |
| SP1_SQ_ki… | SP1→SQ | 4 | Kill vector load |

| Name SQ_SPx_cc 1.2.1523.  Name SP0_SQ_ki                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
                                                 | 2.15 SP0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
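| Name       | Direction | Bits | Description   |
|------------|-----------|------|---------------|
| SQ_SPx_cc… | SQ→SPx    | 128  | Constant load |

#### SP to SQ: Kill vector load

| Name       | Direction | Bits | Description      |
|------------|-----------|------|------------------|
| SP0_SQ_ki… | SP0→SQ    | 4    | Kill vector load |
| SP1_SQ_ki… | SP1→SQ    | 4    | Kill vector load |
| SP2_SQ_ki… | SP2→SQ    | 4    | Kill vector load |
| SP3_SQ_ki… | SP3→SQ    | 4    | Kill vector load |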
| Name SQ_SPx_cc 1.2.1523.  Name SP0_SQ_ki SP1_SQ_ki SP2_SQ_ki SP3_SQ_ki 1.2.1623.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
                                                 | 2.15 SP0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
                                                                                                  | Directio                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
                                                                                                                                               | n<br>x<br>Kill vector<br>n<br>a<br>a<br>a<br>a<br>a<br>b<br>BBM bus                                                                                                          | Bits C 128 C 10ad Bits C 4 K 4 K 4 K 4 K 4 K 4 K 4 K 4 K 4 K 4                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
                                                                                                                                                                                                                                                                                                                                                                             | Descript Descript Descript Kill vecto Kill vecto Kill vecto                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
                                                                                                                                                                                                                                                                                                                                                                                                                              | ion r load r load r load r load                                                                                                                                                                                                                                                 | 4-                                      | Formatted: Bullets and Numbering                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| Name SQ_SPx_ct 1.2.1523  Name SP0_SQ_ki SP1_SQ_ki SP2_SQ_ki SP3_SQ_ki 1.2.1623  Name                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
                                                 | 2.15 SP0  ill_vect ill_vect ill_vect ct ill_vect 2.16 SQ to                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
                                                                                                  | Directio   SQ → SP                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
                                                                                                                                               | n<br>x<br>Kill vector<br>n<br>a<br>a<br>a<br>a<br>a<br>BBM bus                                                                                                               | Bits   C                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
                                                                                                                                                                                                                                                                                                                                                                             | Descript Descript Constant Descript Cill vecto Cill vecto Cill vecto Cill vecto                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
                                                                                                                                                                                                                                                                                                                                                                                                                              | ion r load r load r load r load r load r load                                                                                                                                                                                                                                   | 4                                       | Formatted: Bullets and Numbering                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| Name SQ_SPX_ct 1.2.1523  Name SP0_SQ_ki SP1_SQ_ki SP2_SQ_ki SP3_SQ_ki 1.2.1623  Name SQ_RBB_t                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
                                                 | onst  2.15 SP0  ill_vect ill_vect ill_vect 2.16 SQ to                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
                                                                                                  | Directio     SQ → SP     to SQ: f     Directio     SP0 → S(     SP1 → S(     SP2 → S(     SP3 → S(     Directio     SQ → CP                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
| Name          | Direction | Bits | Description |
|---------------|-----------|------|-------------|
| SQ_SPx_const… | SQ → SPx  |      | Constant…   |

SP0 to SQ: Kill vector load

| Name              | Direction | Bits | Description      |
|-------------------|-----------|------|------------------|
| SP0_SQ_kill_vect… | SP0 → SQ  |      | Kill vector load |
| SP1_SQ_kill_vect… | SP1 → SQ  |      | Kill vector load |
| SP2_SQ_kill_vect… | SP2 → SQ  |      | Kill vector load |
| SP3_SQ_kill_vect… | SP3 → SQ  |      | Kill vector load |

SQ to CP: RBBM bus

| Name           | Direction | Bits | Description |
|----------------|-----------|------|-------------|
| SQ_RBB_r…      | SQ → CP   |      | Read St…    |
| SQ_RBB_r…      | SQ → CP   |      | Read Da…    |
| SQ_RBBM_nrtrtr | SQ → CP   |      | Optional    |
| SQ_RBBM_rtr    | SQ → CP   |      | Real-Ti…    |
| Name | Direction | Bits | Description |
| SQ_SPx_c… | SQ → SPx | 128 | Constant… |
| SP0_SQ_kill_vect | | 4 | Kill vector |
| SP1_SQ_kill_vect | | 4 | Kill vector |
| SP2_SQ_kill_vect | | 4 | Kill vector |
| SP3_SQ_kill_vect | | 4 | Kill vector |
| SQ_RBB_r… | RBBM bus | | Read St… |
| SQ_RBB_r… | RBBM bus | | Read Data |
| SQ_RBBM_nrtrtr | RBBM bus | | Optional |
| SQ_RBBM_rtr | RBBM bus | | Real-Time… |
| Name SQ_SPx_cr  1.2.1523  Name SP0_SQ_ki SP1_SQ_ki SP2_SQ_ki .2.1623  Name SQ_RBB_r SQ_RBB_R SQ_RBBM .2.1723  Name rbbm_we rbbm_a                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
                                                 | 2.15 SP0  ill_vect ill_vect ill_vect ill_vect cs-cd _nrtrtr _rtr                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
                                                                                                  | Directio   SQ → SP                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
                                                                                                                                               | n<br>x<br>Xill vector<br>n<br>2<br>2<br>2<br>2<br>3<br>BBM bus<br>n                                                                                                          | Bits   C                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
                                                                                                                                                                                                                                                                                                                                                                             | Descriptionstant  Descriptions | ion r load c load r load  ion robe ata ne (Optional)                                                                                                                                                             | 4                                       | Formatted: Bullets and Numbering  Formatted: Bullets and Numbering                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
| Name SQ_SPX_cr 1.2.1523  Name SP0_SQ_ki SP1_SQ_ki SP2_SQ_ki SP3_SQ_ki 1.2.1623  Name SQ_RBB_r SQ_RBBBB SQ_RBBM 1.2.1723  Name rbbm_we rbbm_a rbbm_wd                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
                                                 | 2.15 SP0  ill_vect ill_vect ill_vect ill_vect cs-cd _nrtrtr _rtr                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
                                                                                                  | Directio   SQ → SP                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
                                                                                                                                               | n<br>x<br>Kill vector<br>n<br>2<br>2<br>2<br>BBM bus<br>n                                                                                                                    | Bits   C   128   C   128 | Descript Constant Descript Cill vector Cil | ion r load cr load cr load r load | 4                                       | Formatted: Bullets and Numbering  Formatted: Bullets and Numbering                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| Name SQ_SPx_cr 1.2.1523  Name SP0_SQ_ki SP1_SQ_ki SP3_SQ_ki SP3_SQ_ki 1.2.1623  Name SQ_RBB_r SQ_RBB_r SQ_RBBB_r SQ_RBBB_r SQ_RBBM Q_RBBM DQ_RBBM DQ_RBBM DQ_RBBM DQ_RBBM DQ_RBBM DQ_RBBM DQ_RBBM                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
                                                 | 2.15 SP0  ill_vect ill_vect ill_vect ill_vect cs-cd _nrtrtr _rtr                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
                                                                                                  | Directio   SQ → SP                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
                                                                                                                                               | n<br>x<br>Kill vector<br>n<br>a<br>a<br>a<br>a<br>BBM bus<br>n                                                                                                               | Bits   C   128   C   128 | Descript Constant Descript Kill vector Kill vector Cill vector Kill vector Cescript Read St Read Da Descript Write Er Address Data Byte Ens Read Re                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
                                                                                                                                                                                                                                                                                                                                                                                                          | ion r load cr load r load cr load r load r load sion robe ata ne (Optional)  ion nable Upper Extent is TBD (16:2)  ables nable eturn Strobe 0                                                                                         | 4                                       | Formatted: Bullets and Numbering  Formatted: Bullets and Numbering                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
| Name SQ_SPX_cc 1.2.1523.  Name SP0_SQ_ki SP1_SQ_ki SP2_SQ_ki SP3_SQ_ki 1.2.1623.  Name SQ_RBB_r SQ_RBB_SQ_RBBM SQ_RBBM SQ_RBBM T2.1723.  Name rbbm_we rbbm_a rbbm_wd rbbm_be rbbm_re                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
                                                 | 2.15 SP0  ill_vect ill_vect ill_vect ill_vect cs-cd _nrtrtr _rtr                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
                                                                                                  | Directio   SQ → SP                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
                                                                                                                                               | n<br>x<br>Kill vector<br>n<br>a<br>a<br>a<br>a<br>a<br>BBM bus<br>n                                                                                                          | Bits   C   128   C   128 | Descript Constant Descript Kill vector Kill vector Cill vector Kill vector Cescript Read St Read Da Descript Write Er Address Data Byte Ens Read Re                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
                                                                                                                                                                                                                                                                                                                                                                                                          | ion r load sion robe ata  ne (Optional)  ion table Upper Extent is TBD (16:2)  ables table teturn Strobe 0 eturn Strobe 1                                                                                               | 4                                       | Formatted: Bullets and Numbering  Formatted: Bullets and Numbering                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               |




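| Name               | Direction | Bits | Description |
|--------------------|-----------|------|-------------|
| rbb_rd1            | CP→SQ     | 32   | Read Data 0 |
| RBBM_SQ_soft_reset | CP→SQ     | 1    | Soft Reset  |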

### 23.2.18 SQ to CP: State report

| Name             | Direction | Bits | Description            |
|------------------|-----------|------|------------------------|
| SQ_CP_vs_event   | SQ→CP     | 1    | Vertex Shader Event    |
| SQ_CP_vs_eventid | SQ→CP     | 2    | Vertex Shader Event ID |
| SQ_CP_ps_event   | SQ→CP     | 1    | Pixel Shader Event     |
| SQ_CP_ps_eventid | SQ→CP     | 2    | Pixel Shader Event ID  |

eventid = 0 => \*sEndOfState (i.e. VsEndOfState)

eventid = 1 => \*sDone (i.e. VsDone)

So the CP will assume the VS is done with a state whenever it gets a pulse on SQ\_CP\_vs\_event while SQ\_CP\_vs\_eventid = 0.
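As an illustration only, the event encoding above can be modeled in a few lines of Python. The function name `decode_event` and the `"vs"`/`"ps"` selector are hypothetical. The text defines only event IDs 0 and 1, so this sketch rejects the other two values of the 2-bit field.

```python
from enum import IntEnum


class EventId(IntEnum):
    """Event IDs named in the text; values 2 and 3 are not defined there."""
    END_OF_STATE = 0  # *sEndOfState
    DONE = 1          # *sDone


def decode_event(shader, event, eventid):
    """Return the event name (e.g. 'VsEndOfState'), or None when no pulse.

    shader  -- 'vs' or 'ps' (hypothetical selector for the signal pair)
    event   -- value of SQ_CP_*s_event (1 = pulse)
    eventid -- value of SQ_CP_*s_eventid
    """
    if not event:
        return None
    prefix = {"vs": "Vs", "ps": "Ps"}[shader]
    suffix = {EventId.END_OF_STATE: "EndOfState", EventId.DONE: "Done"}
    return prefix + suffix[EventId(eventid)]  # ValueError for undefined IDs
```

For example, `decode_event("vs", 1, 0)` yields `"VsEndOfState"`, the case in which the CP treats the VS as done with a state.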

### 23.3 Example of control flow program execution

We now provide some examples of execution to better illustrate the new design.

#### Given the program:

Alu 0

Alu 1

Tex 0

Tex 1

Alu 3 Serial

Alu 4

Tex 2

Alu 5

Alu 6 Serial

Tex 3

Alu 7

Alloc Position 1 buffer

Alu 8 Export

Tex 4

Alloc Parameter 3 buffers

Alu 9 Export 0

Tex 5

Alu 10 Serial Export 2

Alu 11 Export 1 End

#### Would be converted into the following CF instructions:

Execute 0 Alu 0 Alu 0 Tex 0 Tex 1 Alu 0 Alu 0 Tex 0 Alu 1 Alu 0 Tex

Execute 0 Alu

Alloc Position 1

Execute 0 Alu 0 Tex

Alloc Param 3

Execute end 0 Alu 0 Tex 1 Alu 0 Alu End
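One way to sanity-check the conversion is to read each Execute entry as a serial bit followed by an instruction type, then compare the flattened entries with the original program. That reading is an interpretation of the listing, not something the text states. The sketch below is hedged accordingly: `parse_execute` and `flatten` are hypothetical helper names, and the export annotations are ignored.

```python
# Shader program from the text as (serial_bit, type) pairs, in order:
# Alu 0, Alu 1, Tex 0, Tex 1, Alu 3 Serial, Alu 4, Tex 2, Alu 5, Alu 6 Serial,
# Tex 3, Alu 7, Alu 8, Tex 4, Alu 9, Tex 5, Alu 10 Serial, Alu 11.
PROGRAM = [
    (0, "Alu"), (0, "Alu"), (0, "Tex"), (0, "Tex"), (1, "Alu"), (0, "Alu"),
    (0, "Tex"), (0, "Alu"), (1, "Alu"), (0, "Tex"), (0, "Alu"), (0, "Alu"),
    (0, "Tex"), (0, "Alu"), (0, "Tex"), (1, "Alu"), (0, "Alu"),
]

# Control flow listing, assuming the "serial bit, type" reading.
CF = [
    "Execute 0 Alu 0 Alu 0 Tex 0 Tex 1 Alu 0 Alu 0 Tex 0 Alu 1 Alu 0 Tex",
    "Execute 0 Alu",
    "Alloc Position 1",
    "Execute 0 Alu 0 Tex",
    "Alloc Param 3",
    "Execute end 0 Alu 0 Tex 1 Alu 0 Alu End",
]


def parse_execute(line):
    """Split one Execute line into (serial_bit, type) pairs."""
    tokens = [t for t in line.split() if t.lower() not in ("execute", "end")]
    if len(tokens) % 2:
        raise ValueError(f"unpaired token in {line!r}")
    return [(int(bit), kind) for bit, kind in zip(tokens[0::2], tokens[1::2])]


def flatten(cf_lines):
    """Concatenate the entries of every Execute line, skipping Alloc lines."""
    entries = []
    for line in cf_lines:
        if line.startswith("Execute"):
            entries.extend(parse_execute(line))
    return entries
```

Under this reading, `flatten(CF) == PROGRAM` holds: the four Execute instructions carry 10, 1, 2 and 4 entries, which together cover all 17 ALU and texture instructions.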

#### And the execution of this program would look like this:

#### Put thread in Vertex RS:

Control Flow Instruction Pointer (12 bits), (CFP)
Execution Count Marker (3 or 4 bits), (ECM)


Loop Iterators (4x9 bits), (LI)
Call return pointers (4x12 bits), (CRP)
Predicate Bits (4x64 bits), (PB)
Export ID (1 bit), (EXID)
GPR Base Ptr (8 bits), (GPR)
Export Base Ptr (7 bits), (EB)
Context Ptr (3 bits), (CPTR)
LOD correction bits (16x6 bits), (LOD)

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 0   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |
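To give a feel for how much per-thread state these fields add up to, the sketch below totals the listed widths and packs the fields into a single integer. The field order within the word, the choice of 4 bits for ECM (the text allows 3 or 4), and the treatment of each multi-entry field as one wide integer are all assumptions made for illustration. `pack`, `unpack` and `total_bits` are hypothetical names.

```python
# (name, width in bits), in the order the fields are listed in the text.
STATE_FIELDS = [
    ("CFP", 12),       # Control Flow Instruction Pointer
    ("ECM", 4),        # Execution Count Marker (3 or 4 bits; 4 assumed here)
    ("LI", 4 * 9),     # Loop Iterators
    ("CRP", 4 * 12),   # Call return pointers
    ("PB", 4 * 64),    # Predicate Bits
    ("EXID", 1),       # Export ID
    ("GPR", 8),        # GPR Base Ptr
    ("EB", 7),         # Export Base Ptr
    ("CPTR", 3),       # Context Ptr
    ("LOD", 16 * 6),   # LOD correction bits
]


def total_bits():
    """Sum of all field widths."""
    return sum(width for _, width in STATE_FIELDS)


def pack(values):
    """Pack a {name: int} dict into one integer; missing fields default to 0."""
    word = 0
    for name, width in STATE_FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack(word):
    """Inverse of pack(): recover every field from the packed integer."""
    values = {}
    for name, width in reversed(STATE_FIELDS):
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return values
```

With these widths `total_bits()` is 471 (470 with a 3-bit ECM), and the all-zero initial state shown in the table packs to 0.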

Valid Thread (VALID)

Texture/ALU engine needed (TYPE)
Texture Reads are outstanding (PENDING)
Waiting on Texture Read to Complete (SERIAL)
Allocation Wait (2 bits) (ALLOC)

00 - No allocation needed

01 - Position export allocation needed (ordered export)

10 - Parameter or pixel export needed (ordered export)

11 - pass thru (out of order export)

Allocation Size (4 bits) (SIZE)
Position Allocated (POS\_ALLOC)
First thread of a new context (FIRST)
Last (1 bit), (LAST)

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 0       | 0      | 0     | 0    | 0         | 1     | 0    |
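The status bits and the 2-bit ALLOC encoding listed above can be captured in a small record type. This is a minimal sketch. The class and member names (`Alloc`, `Status`, `is_ordered_export`) are hypothetical, and representing TYPE as a string is an illustrative choice, not the hardware encoding.

```python
from dataclasses import dataclass
from enum import IntEnum


class Alloc(IntEnum):
    """2-bit Allocation Wait field, values as listed in the text."""
    NONE = 0b00            # no allocation needed
    POSITION = 0b01        # position export allocation (ordered export)
    PARAM_OR_PIXEL = 0b10  # parameter or pixel export (ordered export)
    PASS_THRU = 0b11       # pass thru (out of order export)


@dataclass
class Status:
    valid: bool = False
    type: str = "ALU"      # "ALU" or "TEX": engine needed next
    pending: bool = False  # texture reads are outstanding
    serial: bool = False   # must wait for texture reads to complete
    alloc: Alloc = Alloc.NONE
    size: int = 0          # allocation size (4 bits)
    pos_alloc: bool = False
    first: bool = False    # first thread of a new context
    last: bool = False


def is_ordered_export(alloc):
    """True for the two ALLOC codes the text marks as ordered exports."""
    return alloc in (Alloc.POSITION, Alloc.PARAM_OR_PIXEL)


# Status of the thread when it is first placed in the vertex RS.
initial = Status(valid=True, type="ALU", first=True)
```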

Then the thread is picked up for the execution of the first control flow instruction:

Execute 0 Alu 0 Alu 0 Tex 0 Tex 1 Alu 0 Alu 0 Tex 0 Alu 1 Alu 0 Tex

It executes the first two ALU instructions and goes back to the RS for a resource request change. Here is the state returned to the RS:

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 2   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | TEX  | 0       | 0      | 0     | 0    | 0         | 1     | 0    |
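The hand-off just described, where a thread runs until its next instruction needs the other engine and then returns to the RS, can be sketched as a small stepping function. This is a simplified model under stated assumptions: a burst ends only when the instruction type changes, PENDING is set only by the burst that issued texture reads, and earlier reads have returned by the time an ALU burst ends. `run_burst` is a hypothetical name, and only the first eight instructions of the first Execute are modeled.

```python
# First eight entries of the first Execute, as (type, serial_bit).
INSTRUCTIONS = [
    ("ALU", 0), ("ALU", 0), ("TEX", 0), ("TEX", 0),
    ("ALU", 1), ("ALU", 0), ("TEX", 0), ("ALU", 0),
]


def run_burst(instrs, ecm):
    """Run consecutive same-type instructions starting at index ecm.

    Returns (new_ecm, status), where status holds the TYPE, PENDING and
    SERIAL bits handed back to the RS, or None when the list is exhausted.
    """
    kind = instrs[ecm][0]
    while ecm < len(instrs) and instrs[ecm][0] == kind:
        ecm += 1
    if ecm == len(instrs):
        return ecm, None
    next_kind, next_serial = instrs[ecm]
    status = {
        "TYPE": next_kind,
        "PENDING": 1 if kind == "TEX" else 0,  # reads were just issued
        "SERIAL": next_serial,
    }
    return ecm, status


def trace(instrs, bursts):
    """Collect (ECM, status) after each of the first `bursts` bursts."""
    ecm, out = 0, []
    for _ in range(bursts):
        ecm, status = run_burst(instrs, ecm)
        out.append((ecm, status))
    return out
```

Calling `trace(INSTRUCTIONS, 4)` gives ECM values 2, 4, 6 and 7, with the thread requesting TEX, ALU (PENDING and SERIAL set), TEX, and ALU (PENDING set only) in turn.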

Then when the texture pipe frees up, the arbiter picks up the thread to issue the texture reads. The thread comes back in this state:

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 4   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 1      | 0     | 0    | 0         | 1     | 0    |

Because of the SERIAL bit, the arbiter must wait for the texture reads to return and clear the PENDING bit before it can pick the thread up. Let's say the texture reads are complete; the arbiter then picks up the thread and returns it in this state:

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 6   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | TEX  | 0       | 0      | 0     | 0    | 0         | 1     | 0    |
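The serial-bit rule used in this step can be written as a simple eligibility check. This is a sketch, not the arbiter's actual logic: it ignores the ALLOC wait and any priority between threads, and it assumes that a thread with PENDING set but SERIAL clear may still be picked. `can_pick` and its parameters are hypothetical.

```python
def can_pick(status, alu_free, tex_free):
    """Return True if the arbiter may pick this thread up.

    status   -- dict with VALID, TYPE ('ALU' or 'TEX'), PENDING, SERIAL
    alu_free -- True when the ALU engine can accept a thread
    tex_free -- True when the texture pipe can accept a thread
    """
    if not status["VALID"]:
        return False
    # A serial instruction must wait until outstanding texture reads return.
    if status["SERIAL"] and status["PENDING"]:
        return False
    return alu_free if status["TYPE"] == "ALU" else tex_free


# The thread as it came back after issuing its first two texture reads.
waiting = {"VALID": 1, "TYPE": "ALU", "PENDING": 1, "SERIAL": 1}
# The same thread once the texture data has returned.
ready = dict(waiting, PENDING=0)
```

Here `can_pick(waiting, True, True)` is false even with both engines free, while `can_pick(ready, True, True)` is true.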

Again the TP frees up, the arbiter picks up the thread and executes. It returns in this state:

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 7   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1         | ALU  | 1       | 0                                      | 0     | 0    | 0         | 1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                                                                                                                                  | 0    |

Even if the texture fetch has not returned, the thread can still be picked up for ALU execution because the serial bit is not set. The thread will, however, come back to the RS for the second ALU instruction, because that instruction has the serial bit set.

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 8   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 1      | 0     | 0    | 0         | 1     | 0    |

As soon as the TP clears the pending bit, the thread is picked up and returns with the following state:

| State Bits |     |    |     |    |      |     |    |      |     |   |
|------------|-----|----|-----|----|------|-----|----|------|-----|---|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |   |
| 0          | 9   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   | 1 |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | TEX  | 0       | 0      | 0     | 0    | 0         | 1     | 0    |
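
The three status snapshots above suggest a simple eligibility rule: a valid thread is held in the RS only while PENDING and SERIAL are both set. The sketch below illustrates that reading. It is not the sequencer's actual arbitration logic; the names `ThreadStatus` and `can_be_picked` are hypothetical, and the rule itself is an assumption inferred from the walk-through, ignoring the other status fields (ALLOC, SIZE, POS_ALLOC, FIRST, LAST) and any arbiter priority.

```python
from dataclasses import dataclass


@dataclass
class ThreadStatus:
    """Subset of the status bits used in the walk-through (hypothetical model)."""
    valid: bool
    kind: str       # TYPE column: "ALU" or "TEX"
    pending: bool   # a texture fetch is still outstanding
    serial: bool    # the next instruction must wait for the fetch


def can_be_picked(status: ThreadStatus) -> bool:
    # Assumed rule: a valid thread is blocked only while a fetch is
    # outstanding (PENDING) and the next instruction is SERIAL.
    return status.valid and not (status.pending and status.serial)


# First ALU instruction: fetch outstanding, serial bit clear -> eligible.
first_alu = ThreadStatus(valid=True, kind="ALU", pending=True, serial=False)

# Second ALU instruction: fetch outstanding, serial bit set -> waits in the RS.
second_alu = ThreadStatus(valid=True, kind="ALU", pending=True, serial=True)

# After the TP clears the pending bit -> eligible again.
after_return = ThreadStatus(valid=True, kind="TEX", pending=False, serial=False)

for name, st in [("first_alu", first_alu),
                 ("second_alu", second_alu),
                 ("after_return", after_return)]:
    print(name, "eligible" if can_be_picked(st) else "blocked")
```

Under this assumed rule, clearing PENDING is enough to release a thread regardless of the SERIAL bit, which matches the statement that the thread is picked up as soon as the TP clears the pending bit.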




| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
|            | 0   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1     | ALU  | 1       | 0      | 0     | 0    | 0         | 1     | 0    |

The thread is picked up by the ALU and returns (let's say the TP has not returned yet):

Alloc Position 1

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 2          | 0   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 0      | 01    | 1    | 0         | 1     | 0    |

If the SX has room for the export, the SQ allocates it and picks up the thread for execution. The thread then returns to the RS in this state:

Execute Alu 0 Alu Tex 0 Tex

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 3          | 1   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | TEX  | 1       | 0      | 0     | 0    | 1         | 1     | 0    |

Since the TP has not returned yet, we must wait for it, because we cannot issue multiple texture requests. Once the TP returns, it clears the PENDING bit and we proceed:

Alloc Param 3

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 4          | 0   | 0  | 0   | 0  | 1    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 10     | 10    | 3    | 1         | 1     | 10   |
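The PENDING rule described above can be sketched as follows. This is a minimal illustration, not the SQ's actual logic: the field names come from the Status Bits tables, while the `StatusBits` class, the two helper functions, and the reading of PENDING as "a TP request is still outstanding" are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class StatusBits:
    # Only the fields this sketch needs; names follow the Status Bits tables.
    valid: bool
    type: str      # "ALU" or "TEX"
    pending: bool  # assumed meaning: a TP request is still outstanding


def can_issue_texture(s: StatusBits) -> bool:
    """A TEX thread may be sent to the TP only if no request is pending."""
    return s.valid and s.type == "TEX" and not s.pending


def on_tp_return(s: StatusBits) -> None:
    """Hypothetical handler: the TP returning clears the PENDING bit."""
    s.pending = False


# The thread comes back to the RS as TEX while the TP has not returned yet.
thread = StatusBits(valid=True, type="TEX", pending=True)
blocked = can_issue_texture(thread)  # must wait for the TP
on_tp_return(thread)
ready = can_issue_texture(thread)    # PENDING cleared, thread can proceed
```

Here `blocked` captures the waiting state and `ready` the state after the TP has returned.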

Once again the SQ makes sure the SX has enough room in the Parameter cache before it can pick up this thread.
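The room check the SQ performs before picking up an allocating thread could look roughly like the sketch below. The ALLOC encodings (01 after `Alloc Position`, 10 after `Alloc Param`) and the SIZE values are read off the tables in this walk-through; the `SxFreeSpace` class, the function name, and the idea that free space is counted in SIZE units are assumptions for illustration only.

```python
from dataclasses import dataclass

# ALLOC values as they appear in the Status Bits tables of this example.
ALLOC_NONE = 0b00
ALLOC_POSITION = 0b01  # seen after "Alloc Position 1"
ALLOC_PARAM = 0b10     # seen after "Alloc Param 3"


@dataclass
class SxFreeSpace:
    """Hypothetical view of the free space the SX reports to the SQ."""
    position_slots: int
    parameter_slots: int


def sx_has_room(alloc: int, size: int, free: SxFreeSpace) -> bool:
    """Return True if the thread's pending allocation fits in the SX."""
    if alloc == ALLOC_NONE:
        return True  # nothing to allocate, no check needed
    if alloc == ALLOC_POSITION:
        return free.position_slots >= size
    if alloc == ALLOC_PARAM:
        return free.parameter_slots >= size
    raise ValueError(f"unexpected ALLOC value: {alloc:#04b}")


free = SxFreeSpace(position_slots=1, parameter_slots=2)
position_ok = sx_has_room(ALLOC_POSITION, 1, free)  # "Alloc Position 1"
param_ok = sx_has_room(ALLOC_PARAM, 3, free)        # "Alloc Param 3"
```

With the assumed free space above, the position export fits, but the thread requesting three parameter slots would stay in the RS until the Parameter cache frees up.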

Execute end Alu-0 Alu Tex-0 Tex Alu-1 Alu Alu-0 AluEnd



| State Bits |     |    |     |    |      |     |     |      |     |
|------------|-----|----|-----|----|------|-----|-----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB  | CPTR | LOD |
| 5          | 1   | 0  | 0   | 0  | 1    | 0   | 100 | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | TEX  | 1       | 0      | 0     | 0    | 1         | 1     | 0    |

This executes on the TP and then returns:

| State Bits |     |    |     |    |      |     |     |      |     |
|------------|-----|----|-----|----|------|-----|-----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB  | CPTR | LOD |
| 5          | 2   | 0  | 0   | 0  | 1    | 0   | 100 | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1         | ALU  | 1       | 1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                              | 0                                      | 0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
                                                                                                                                                                        | 1         | 1     | 1    |  |

The thread waits for the TP to return because texture reads are pending (and the clause is SERIAL in this case). It then executes and does not return to the RS because the LAST bit is set. This is the end of the thread; before dropping it, the SQ notifies the SX of export completion.
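The final step described above can be sketched as a small decision function. This is an illustrative model only, not the SQ implementation: the field names `tex_reads_pending`, `serial`, and `last`, the `ThreadStatus` type, and the action strings are assumptions introduced here; only the SERIAL and LAST bits, the TP, the RS, and the SX notification come from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ThreadStatus:
    # Hypothetical field names; the spec text only names SERIAL and LAST.
    tex_reads_pending: bool
    serial: bool
    last: bool


def next_actions(status: ThreadStatus, tp_returned: bool) -> List[str]:
    """Return the ordered actions for one ALU clause in this simplified model."""
    # A SERIAL clause with outstanding texture reads must wait for the TP.
    if status.tex_reads_pending and status.serial and not tp_returned:
        return ["wait_for_tp"]
    actions = ["execute"]
    if status.last:
        # End of thread: notify the SX of export completion, then drop the thread.
        actions += ["notify_sx_export_done", "retire_thread"]
    else:
        # Not the last clause: the thread goes back to the reservation station.
        actions.append("return_to_rs")
    return actions


final = ThreadStatus(tex_reads_pending=True, serial=True, last=True)
print(next_actions(final, tp_returned=False))  # ['wait_for_tp']
print(next_actions(final, tp_returned=True))   # ['execute', 'notify_sx_export_done', 'retire_thread']
```

In this sketch a clause without the LAST bit would end with `return_to_rs` instead of the SX notification, which is the only difference between a final and a non-final ALU clause in the model.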

### 25.24. Open issues

The size of the register file and the register file allocation method (dynamic vs. static) still need to be tested.

Saving power?


|           | 24 September, 2001 | 4 September, 20152 | GEN-CXXXXX-REVA | 1 of 51 |
|-----------|--------------------|--------------------|-----------------|---------|
| Author:   | Laurent Lefebvre   |                    |                 |         |
| Issue To: |                    | Copy No:           |                 |         |

# **R400 Sequencer Specification**

# SQ

### Version 2.04


## **Table Of Contents**

| OVERVIEW                           |    |
|------------------------------------|----|
| Control Graph                      |    |
| INTERPOLATED DATA BUS              |    |
| 3.3 Dirty bits                     |    |
| 3.4 Free List Block                | 17 |
| 3.5 De-allocate Block              | 18 |
| 3.6 Operation of Incremental model | 18 |
| Constant Store Indexing            | 18 |
| Real Time Commands                 |    |
| t t                                            |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| · ·                                            |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| Debugging the Shaders                          |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| 7.1 Method 1: Debugging registers              | 26                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
| 7.2 Method 2: Exporting the values in the GPRs | 26                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
| PIXEL KILL MASK                                |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| MULTIPASS VERTEX SHADERS (HOS)                 |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| REGISTER FILE ALLOCATION                       |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| STAGING REGISTERS                              |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
|                                                | INTERPOLATED DATA BUS. INSTRUCTION STORE. SEQUENCER INSTRUCTIONS. CONSTANT STORES. Memory organizations. Management of the Control Flow Constants. Management of the re-mapping tables. 3.1 R400 Constant management. 3.2 Proposal for R400LE constant management. 3.3 Dirty bits. 3.4 Free List Block. 3.5 De-allocate Block. 3.6 Operation of Incremental model. Constant Store Indexing. Real Time Commands. Constant Waterfalling. LOOPING AND BRANCHES. The control flow Program. 2.1 Control flow instructions table. Implementation. Data dependant predicate instructions. HW Detection of PV,PS. Register file indexing. Debugging the Shaders. 7.1 Method 1: Debugging registers. 7.2 Method 2: Exporting the values in the GPRs. PIXEL KILL MASK. MULTIPASS VERTEX SHADERS (HOS). REGISTER FILE ALLOCATION. FETCH ARBITRATION. ALU ARBITRATION. HANDLING STALLS. CONTENT OF THE RESERVATION STATION FIFOS. THE OUTPUT FILE. IJ FORMAT. Interpolation of constant attributes. |

|                    |       | ORIGINATE DATE     | EDIT DATE          | DOCUMENT-REV. NUM. | PAGE    |
|--------------------|-------|--------------------|--------------------|--------------------|---------|
|                    |       | 24 September, 2001 | 4 September, 20152 | GEN-CXXXXX-REVA    | 3 of 51 |
| 17.                |       | PARAMETER CACHE
                                                                              |                                        |                                                        |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
                                                                                                                                                                                                                                 |
| 17.1               |       | | | | |
|                    | .1.1  | | | | |
|                    | .1.2  | | | | |
| 17.2               |       | | | | |
| 18.                |       | | | | |
| 18.1               |       | | | | |
| 18.2               |       | Pixel Shading | | | 33 |
| 19. 19.1           |       | 
                                                                              |                                        |                                                        |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
                                                                                                                                                                                                                                 |
| 19.2               |       |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
                                                                              |                                        |                                                        |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
                                                                                                                                                                                                                                 |
| 19.3               |       |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
                                                                              |                                        |                                                        |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
                                                                                                                                                                                                                                 |
| 19                 | .3.1  | Vertex shaders                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
                                                                              |                                        |                                                        | 34                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                                                                                                                                                                                                                 |
| 19                 | .3.2  |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
                                                                              |                                        |                                                        |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
                                                                                                                                                                                                                                 |
| 20.                |       |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
                                                                              |                                        |                                                        |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
                                                                                                                                                                                                                                 |
| 20.1<br><b>21.</b> |       |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
                                                                              |                                        |                                                        |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
                                                                                                                                                                                                                                 |
| 21.1               |       |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
                                                                              |                                        |                                                        |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
                                                                                                                                                                                                                                 |
| 22. | REGISTERS | | | | 35 |
| 22.1 | | | | | |
| 22.2 | | | | | |
| <b>23.</b> | | | | | |
| 23.1 | | | | | 35 |
| 23.2 | | | | | |
| 24. | | | | | |
| 24.1 | | | | | |
| 24.2 | | | | | |
| 24 | .2.1 | | | | |
| 24 | .2.2 | | | | |
| 24 | .2.3 | | | | |
| 24 | .2.4 | | | | |
| 24 | .2.5 | VGT to SQ : Vertex interface | | | 39 |
| 24 | .2.6 | SQ to SX: Control bus | | | 42 |
| 24 | .2.7 | SX to SQ : Output file control | | | 42 |
| 24 | .2.8 | SQ to TP: Control bus | | | 43 |
| 24 | .2.9 | TP to SQ: Texture stall | | | 43
                                                                                                                                                                                                                                 |
| 24                 | .2.10 | SQ to SP: Texture sta                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                                                              | all                                    |                                                        | <u>44</u> 43                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
                                                                                                                                                                                                                                 |
| 24                 | .2.11 | SQ to SP: GPR and a                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
                                                                              | uto counter                            |                                                        | 4443                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
                                                                                                                                                                                                                                 |
| 24                 | .2.12 |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
                                                                              |                                        |                                                        |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
                                                                                                                                                                                                                                 |
|                    |       |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
                                                                              |                                        | Set                                                    | WATER CONTROL OF THE PARTY OF T |

|           | ORIGINATE DATE          | EDIT DATE        | R400 Sequencer Specification | PAGE         |
|-----------|-------------------------|------------------|------------------------------|--------------|
|           | 24 September, 2001      |                  |                              | 4 of 51      |
| 24.2.14   | SQ to SPx: constant     | t broadcast      |                              | <u>46</u> 45 |
| 24.2.15   | SP0 to SQ: Kill vector  | or load          |                              | <u>46</u> 45 |
| 24.2.16   | SQ to CP: RBBM bu       | ıs               |                              | <u>46</u> 45 |
| 24.2.17   | CP to SQ: RBBM bu       | ıs               |                              | <u>46</u> 45 |
| 24.2.18   | SQ to CP: State rep     | ort              |                              | <u>47</u> 45 |
| 24.3 Exar | mple of control flow pr | rogram execution |                              | 4746         |
| 25 ODEN   |                         |                  |                              | E4E0         |


## **Revision Changes:**

| Rev  | Author           | Date               | Changes |
|------|------------------|--------------------|---------|
| 0.1  | Laurent Lefebvre | May 7, 2001        | First draft. |
| 0.2  | Laurent Lefebvre | July 9, 2001       | Changed the interfaces to reflect the changes in the SP. Added some details in the arbitration section. |
| 0.3  | Laurent Lefebvre | August 6, 2001     | Reviewed the Sequencer spec after the meeting on August 3, 2001. |
| 0.4  | Laurent Lefebvre | August 24, 2001    | Added the dynamic allocation method for register file and an example (written in part by Vic) of the flow of pixels/vertices in the sequencer. |
| 0.5  | Laurent Lefebvre | September 7, 2001  | Added timing diagrams (Vic). |
| 0.6  | Laurent Lefebvre | September 24, 2001 | Changed the spec to reflect the new R400 architecture. Added interfaces. |
| 0.7  | Laurent Lefebvre | October 5, 2001    | Added constant store management, instruction store management, control flow management and data dependent predication. |
| 0.8  | Laurent Lefebvre | October 8, 2001    | Changed the control flow method to be more flexible. Also updated the external interfaces. |
| 0.9  | Laurent Lefebvre | October 17, 2001   | Incorporated changes made in the 10/18/01 control flow meeting. Added a NOP instruction, removed the conditional execute or jump. Added debug registers. |
| 1.0  | Laurent Lefebvre | October 19, 2001   | Refined interfaces to RB. Added state registers. |
| 1.1  | Laurent Lefebvre | October 26, 2001   | Added SEQ→SP0 interfaces. Changed delta precision. Changed VGT-SP0 interface. Debug Methods added. |
| 1.2  | Laurent Lefebvre | November 16, 2001  | Interfaces greatly refined. Cleaned up the spec. |
| 1.3  | Laurent Lefebvre | November 26, 2001  | Added the different interpolation modes. |
| 1.4  | Laurent Lefebvre | December 6, 2001   | Added the auto incrementing counters. Changed the VGT-SQ interface. Added content on constant management. Updated GPRs. |
| 1.5  | Laurent Lefebvre | December 11, 2001  | Removed from the spec all interfaces that weren't directly tied to the SQ. Added explanations on constant management. Added PA→SQ synchronization fields and explanation. |
| 1.6  | Laurent Lefebvre | January 7, 2002    | Added more details on the staging register. Added detail about the parameter caches. Changed the call instruction to a Conditionnal\_call instruction. Added details on constant management and updated the diagram. |
| 1.7  | Laurent Lefebvre | February 4, 2002   | Added Real Time parameter control in the SX interface. Updated the control flow section. |
| 1.8  | Laurent Lefebvre | March 4, 2002      | New interfaces to the SX block. Added the end of clause modifier, removed the end of clause instructions. |
| 1.9  | Laurent Lefebvre | March 18, 2002     | Rearrangement of the CF instruction bits in order to ensure byte alignment. |
| 1.10 | Laurent Lefebvre | March 25, 2002     | Updated the interfaces and added a section on exporting rules. |
| 1.11 | Laurent Lefebvre | April 19, 2002     | Added CP state report interface. Last version of the spec with the old control flow scheme. |
| 2.0  | Laurent Lefebvre | April 19, 2002     | New control flow scheme. |
| 2.01 | Laurent Lefebvre | May 2, 2002        | Changed slightly the control flow instructions to allow force jumps and calls. |
| 2.02 | Laurent Lefebvre | May 13, 2002       | Updated the Opcodes. Added type field to the constant/pred interface. Added Last field to the SQ→SP instruction load interface. |
| 2.03 | Laurent Lefebvre | July 15, 2002      | SP interface updated to include predication optimizations. Added the predicate no stall instructions. |
| 2.04 | Laurent Lefebvre | August 2, 2002     | Documented the new parameter generation scheme for XY coordinates points and lines STs. |




## 1. Overview

The sequencer chooses two ALU threads and a fetch thread to execute, and executes all of the instructions in a block before looking for a new clause of the same type. The two ALU threads are executed interleaved to hide the ALU latency. The arbitrator gives priority to older threads. There are two separate reservation stations, one for pixel vectors and one for vertex vectors. This way a pixel can pass a vertex and a vertex can pass a pixel.

To support the shader pipe, the sequencer also contains the shader instruction cache, constant store, control flow constants and texture state. The four shader pipes all execute the same instruction, so there is only one sequencer for the whole chip.

The sequencer first arbitrates between vectors of 64 vertices that arrive directly from primitive assembly and vectors of 16 quads (64 pixels) that are generated in the scan converter.

The vertex or pixel program specifies how many GPRs it needs to execute. The sequencer will not start the next vector until the needed space is available in the GPRs.
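The GPR gating rule above can be illustrated with a small model. This is only a sketch under stated assumptions: the class name `GprGate`, the single arrival-order queue, and the GPR pool size passed to the constructor are all hypothetical, and the real sequencer keeps separate pixel and vertex reservation stations.

```python
from collections import deque


class GprGate:
    """Toy model: a vector may start only when its GPR request fits."""

    def __init__(self, total_gprs):
        self.free = total_gprs      # GPRs currently unallocated (pool size is assumed)
        self.pending = deque()      # (name, gprs_needed) in arrival order

    def submit(self, name, gprs_needed):
        """Queue a vector together with the GPR count its program declares."""
        self.pending.append((name, gprs_needed))

    def try_start(self):
        """Start the oldest pending vector if enough GPRs are free, else stall."""
        if self.pending and self.pending[0][1] <= self.free:
            name, need = self.pending.popleft()
            self.free -= need
            return name
        return None

    def retire(self, gprs_released):
        """Return GPRs to the pool when a vector finishes."""
        self.free += gprs_released
```

With a pool of 8 GPRs, a vector needing 6 starts at once, while a following vector needing 4 stalls until the first one retires.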



## 1.1 Top Level Block Diagram



Figure 2: Reservation stations and arbiters

Under this new scheme, the sequencer (SQ) will only use one global state management machine per vector type (pixel, vertex) that we call the reservation station (RS).


## 1.2 Data Flow graph (SP)



Figure 3: The shader Pipe


The gray area represents blocks that are replicated 4 times per shader pipe (16 times on the overall chip).

## 1.3 Control Graph



Figure 4: Sequencer Control interfaces

The Fetch control interface is shown in green, the ALU control interface in red, the Interpolated/Vector control interface in blue, and the output file control interface in purple.

## 2. Interpolated data bus

The interpolators contain an IJ buffer to pack the information as much as possible before writing it to the register file.




Figure 5: Interpolation buffers

[Timing diagram image: XY and IJ buffer write and read activity for SP0 to SP3 across clocks T0 to T8; the diagram contents did not survive extraction.]

Figure 6: Interpolation timing diagram




Above is an example of a tile the sequencer might receive from the SC. The write side shows how the data gets stacked into the XY and IJ buffers; the read side shows how the data is passed to the GPRs. The IJ information is packed in the IJ buffer 4 quads at a time, or two clocks. At any given time the sequencer allows as many as four quads to interpolate a parameter, and they all have to come from the same primitive. The sequencer then controls the write mask to the GPRs so that only the valid data is written.

## 3. Instruction Store

There is going to be only one instruction store for the whole chip. It will contain 4096 instructions of 96 bits each.

It is likely to be a 1-port memory; we use 1 clock to load the ALU instruction, 1 clock to load the Fetch instruction, 1 clock to load 2 control flow instructions and 1 clock to write instructions.
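As a rough illustration of the figures above, the sketch below computes the store size and models the single port as a fixed four-slot rotation. The fixed ordering of the slots and the names `PORT_SLOTS` and `port_user` are assumptions made for the example; the text only states that each of the four uses takes one clock.

```python
INSTRUCTIONS = 4096
BITS_PER_INSTRUCTION = 96


def store_size_bytes():
    """Total instruction store capacity in bytes."""
    return INSTRUCTIONS * BITS_PER_INSTRUCTION // 8


# Assumed rotation of the single memory port, one use per clock.
PORT_SLOTS = ("alu_read", "fetch_read", "cf_read_x2", "write")


def port_user(clock):
    """Return which client owns the port on a given clock (assumed round-robin)."""
    return PORT_SLOTS[clock % len(PORT_SLOTS)]
```

Under these assumptions the store holds 49152 bytes (48 KiB), and each client gets the port once every four clocks.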

The instruction store is loaded by the CP through register-mapped registers.

The VS BASE and PS BASE context registers are used to specify for each context where its shader is in the instruction memory.

Real time commands are handled in much the same way, with a few small differences. There are no wrap-around points for real time, so the driver must be careful not to overwrite regular shader data. The shared code (shared subroutines) uses the same path as real time.

## 4. Sequencer Instructions

All control flow instructions and move instructions are handled by the sequencer only. The ALUs will perform NOPs during this time (MOV PV,PV, PS,PS) if they have nothing else to do.

## 5. Constant Stores

## 5.1 Memory organizations

A likely size for the ALU constant store is 1024x128 bits. The read bandwidth from the ALU constant store is 128 bits/clock and the write bandwidth is 32 bits/clock (dictated by the CP bus size, not by memory ports).

The maximum logical size of the constant store for a given shader is 256 constants, or 512 for the pixel/vertex shader pair. The size of the re-mapping table is 128 lines (each line addresses 4 constants). The write granularity is 4 constants or 512 bits. It takes 16 clocks to write the four constants. Real time requires 256 lines in the physical memory (this is physically register mapped).

The texture state is also kept in a similar memory. The size of this memory is 320x96 bits (128 texture states for regular mode, 32 states for RT). The memory thus holds 128 regular-mode texture states (192 bits per state, that is, two 96-bit lines each). The logical size exposes 32 different states total, which are shared between the pixel and the vertex shader. The size of the re-mapping table for the texture state memory is 32 lines (each line addresses 1 texture state in the real memory). The CP write granularity is 1 texture state (192 bits). The driver sends 512 bits but the CP ignores the top 320 bits. It thus takes 6 clocks to write the texture state. Real time requires 32 lines in the physical memory (this is physically register mapped).
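The write-time and sizing figures quoted for the ALU constant store and the texture state memory follow from the 32 bits/clock CP write bandwidth. The short check below reproduces them; the variable names are illustrative only and do not correspond to register names.

```python
CP_WRITE_BITS_PER_CLOCK = 32


def write_clocks(bits):
    """Clocks needed to push `bits` through the 32-bit CP write path."""
    return bits // CP_WRITE_BITS_PER_CLOCK


# ALU constants: written 4 at a time, 128 bits each.
alu_group_bits = 4 * 128
alu_write_clocks = write_clocks(alu_group_bits)
alu_logical_constants = 128 * 4          # re-mapping lines x constants per line

# Texture state: 192 bits per state, stored as 96-bit lines.
tex_state_bits = 192
tex_lines_per_state = tex_state_bits // 96
tex_write_clocks = write_clocks(tex_state_bits)
tex_memory_lines = (128 + 32) * tex_lines_per_state   # regular + RT states
```

This gives 16 clocks per group of four ALU constants, 6 clocks per texture state, 512 logical ALU constants, and 320 lines of 96 bits for the texture state memory, matching the numbers in the text.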

The control flow constant memory doesn't sit behind a renaming table. It is register mapped and thus the driver must reload its content each time there is a change in the control flow constants. Its size is 320\*32 because it must hold 8 copies of the 32 dwords of control flow constants and the loop construct constants must be aligned.

The constant re-mapping tables for texture state and ALU constants are logically register mapped for regular mode and physically register mapped for RT operation.


## 5.2 Management of the Control Flow Constants

The control flow constants are register mapped: the CP writes to the corresponding register to set a constant, and the SQ decodes the address and writes to the block pointed to by its current base pointer (CF WR BASE). On the read side, one level of indirection is used. A register (SQ\_CONTEXT\_MISC.CF\_RD\_BASE) keeps the current base pointer to the control flow block. This register is copied whenever there is a state change. Should the CP write to CF after the state change, the base register is updated with (current pointer number + 1) % number of states. This way, if the CP doesn't write to CF, the new state uses the previous CF constants.
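The base pointer behaviour described above can be sketched as follows. This is a simplified model, not the hardware: the class `CfConstants` is hypothetical, a single `rd_base` field stands in for both the write base and the newest state's read base, and the count of 8 blocks of 32 dwords is taken from the memory organization section.

```python
NUM_STATES = 8      # copies of the CF constants held in memory
CF_DWORDS = 32      # dwords of CF constants per copy


class CfConstants:
    def __init__(self):
        self.blocks = [[0] * CF_DWORDS for _ in range(NUM_STATES)]
        self.rd_base = 0        # stands in for CF_RD_BASE of the newest state
        self.written = True     # block 0 belongs to the initial state

    def state_change(self):
        # The new state starts out pointing at the previous state's block.
        self.written = False

    def cp_write(self, index, value):
        if not self.written:
            # First CF write after a state change: advance to the next block.
            # The driver is expected to reload the full set, so nothing is copied.
            self.rd_base = (self.rd_base + 1) % NUM_STATES
            self.written = True
        self.blocks[self.rd_base][index] = value

    def read(self, base, index):
        return self.blocks[base][index]
```

A state that never writes CF keeps reading from the previous block, while a state that writes CF gets its own block and leaves the constants of older in-flight states untouched.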

## 5.3 Management of the re-mapping tables

### 5.3.1 R400 Constant management

The sequencer is responsible for managing two re-mapping tables (one for the constant store and one for the texture state). On a state change (by the driver), the sequencer will broadside copy the contents of its re-mapping tables to a new one. We have 8 different re-mapping tables we can use concurrently.

The constant memory update is incremental: the driver only needs to update the constants that actually changed between the two state changes.

For this model to work in its simplest form, the physical memory MUST be at least twice as large as the logical address space plus the space allocated for Real Time. In our case, since the logical address space is 512 entries and the reserved RT space can be up to 256 entries, the memory must hold at least 1280 entries. Similarly, the texture store must hold at least 32\*2+32 = 96 entries.
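The sizing rule can be written as a one-line helper to check both figures. The function name is illustrative, and the reading of the factor of two (a full new copy of the logical space may be written while the previous state is still in use) is an interpretation rather than something the text states.

```python
def min_physical_entries(logical_entries, realtime_entries):
    """Smallest physical memory that satisfies the 2x-logical-plus-RT rule."""
    return 2 * logical_entries + realtime_entries


alu_constant_minimum = min_physical_entries(512, 256)
texture_state_minimum = min_physical_entries(32, 32)
```

These evaluate to 1280 entries for the ALU constant store and 96 entries for the texture store.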

### 5.3.2 Proposal for R400LE constant management

To make this scheme work with only 512+256 = 768 entries, upon reception of a CONTROL packet of state + 1, the sequencer would check for SQ IDLE and PA IDLE and, if both are idle, erase the content of the state and replace it with the new state (this is depicted in Figure 8: De-allocation mechanism). Note that when a state is cleared, a value of 0 is written to the corresponding de-allocation counter location, so that when the SQ reports a state change, nothing is de-allocated upon the first report.

The second path sets all context dirty bits that were used in the current state to 1 (thus allowing the new state to reuse these physical addresses if needed).





Figure 7: Constant management

[Figure 7 annotation, partially recovered: "... behind Set State load - 16 clocks) all other Set States just write one entry to current state."]

[Figure 8 block labels: SQ STATE#, ADDR, WRITE\_ENABLE, DEALOC COUNTERS, CNT VALUE, Free List, PREVIOUS STATE, NEW STATE VALUE, VALID, NOT, OR, SQ IDLE, PA IDLE, CP\_NEW\_STATE\_CNTL, REMAPPING, SET CTX BITS]

Figure 8: De-allocation mechanism for R400LE

### 5.3.3 Dirty bits

Two sets of dirty bits will be maintained per logical address. The first one (the reset dirty bit) will be set to zero on reset and set when the logical address is addressed. The second one (the context dirty bit) will be set to zero whenever a new context is written and set for each address written while in this context. If the reset dirty bit is not set, then writing to that logical address will not require de-allocation of whatever address is stored in the renaming table. If it is set and the context dirty bit is not set, then the physical address stored in the renaming table needs to be de-allocated and a new physical address is necessary to store the incoming data. If they are both set, then the data will be written into the physical address held in the renaming table for the current logical address. No de-allocation or allocation takes place. This will happen when the driver does a set constant twice to the same logical address between context changes. NOTE: It is important to detect this case and reuse the existing physical address; failing to do so would allow multiple writes to allocate all physical memory and thus hang, because a context would not fit for rendering to start and thus free up space.
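The three outcomes of a constant write can be summarized in a small decision function. This is a sketch of the rule as stated above, not of the hardware; the enum and function names are hypothetical.

```python
from enum import Enum


class WriteAction(Enum):
    ALLOCATE = "allocate a new physical address, nothing to de-allocate"
    REALLOCATE = "de-allocate the old physical address and allocate a new one"
    OVERWRITE = "write into the physical address already in the renaming table"


def classify_write(reset_dirty: bool, context_dirty: bool) -> WriteAction:
    """Decide what a write to one logical address requires."""
    if not reset_dirty:
        # Address never written since reset: the renaming table entry is unused.
        return WriteAction.ALLOCATE
    if not context_dirty:
        # Written in an earlier context but not yet in this one.
        return WriteAction.REALLOCATE
    # Second write to the same logical address within the same context.
    return WriteAction.OVERWRITE
```

The combination "reset dirty clear, context dirty set" should not occur, since any write sets both bits; the sketch treats it like a first write.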

### 5.3.4 Free List Block

The free list block would consist of a counter (called the IFC, or Initial Free Counter) that resets to zero and is incremented every time a chunk of physical memory is used, until all chunks have been used once. This counter is checked each time a physical block is needed: if the original blocks have not all been used up, a new one is used; otherwise the free list is checked for an available physical block address. When a chunk is taken from the counter, the count itself is the physical address.

The block also requires storage for a free list big enough to hold all physical block addresses.

**TABLE** 

Three pointers are maintained for the free list, all reset to zero. The first one we will call write\_ptr. This pointer identifies the next location in which to write the physical address of a block to be de-allocated. Note: we can never free more physical memory locations than we have. Once an address has been recorded, the pointer is incremented to walk the free list like a ring.

The second pointer will be called stop\_ptr. The stop\_ptr pointer will be advanced by the number of address chunks de-allocated when a context finishes. The addresses between the stop\_ptr and the write\_ptr cannot be reused because they are still in use. But as soon as the context using them is dismissed, the stop\_ptr will be advanced.

The third pointer will be called read\_ptr. This pointer will point to the next address that can be used for allocation, as long as the read\_ptr does not equal the stop\_ptr and the IFC is at its maximum count.
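The interaction of the IFC and the three ring pointers can be illustrated with a minimal software model. It is a sketch under assumptions: the class and method names are hypothetical, the ring has exactly one slot per physical block, and returning `None` stands in for back-pressure.

```python
class FreeList:
    """Toy model of the IFC plus the write/stop/read ring pointers."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.ifc = 0                      # Initial Free Counter
        self.ring = [None] * num_blocks   # storage for freed block addresses
        self.write_ptr = 0                # next slot to record a freed address
        self.stop_ptr = 0                 # entries before this one are reusable
        self.read_ptr = 0                 # next reusable entry to hand out

    def allocate(self):
        """Return a physical block address, or None if nothing is available."""
        if self.ifc < self.num_blocks:
            addr = self.ifc               # the count itself is the address
            self.ifc += 1
            return addr
        if self.read_ptr != self.stop_ptr:
            addr = self.ring[self.read_ptr]
            self.read_ptr = (self.read_ptr + 1) % self.num_blocks
            return addr
        return None

    def release(self, addr):
        """Record a block to be de-allocated; it is not reusable yet."""
        self.ring[self.write_ptr] = addr
        self.write_ptr = (self.write_ptr + 1) % self.num_blocks

    def retire_context(self, freed_count):
        """A context finished: its freed blocks become available to allocate."""
        self.stop_ptr = (self.stop_ptr + freed_count) % self.num_blocks
```

Blocks recorded with `release` stay unavailable until `retire_context` advances the stop pointer past them, which mirrors the rule that addresses between stop\_ptr and write\_ptr are still in use.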


### 5.3.5 De-allocate Block

This block will maintain a free physical address block count for each context. While in the current context, a count shall be maintained specifying how many blocks were written into the free list at the write\_ptr pointer. This count will be reset upon reset or when this context is active on the back and different from the previous context. It is actually a count of blocks in the previous context that will no longer be used. This count will be used to advance the write\_ptr pointer to make available the set of physical blocks freed when the previous context was done. This allows the discard or de-allocation of any number of blocks in one clock.

### 5.3.6 Operation of Incremental model

The basic operation of the model starts with the write\_ptr, stop\_ptr and read\_ptr pointers in the free list set to zero and the free list counter set to zero. All the dirty bits and the previous context are also initialized to zero. When the first set constants happen, the reset dirty bit will not be set, so a physical location is allocated from the free list counter because it is not at the max value. The data will be written into physical address zero. Both the additional copy of the renaming table and the context-zero location of the big renaming table will be updated for the logical address that was written by the set state, with a physical address of 0. This process will be repeated for any logical addresses that are not dirty until the context changes. If a logical address is hit that has its dirty bits set while in the same context, both dirty bits would be set, so the new data will be over-written to the last physical address assigned for this logical address. When the first draw command of the context is detected, the previous context stored in the additional renaming table will be copied to the larger renaming table in the current (new) context location. Then the set constant logical address will be loaded with a new physical address during the copy and, if the reset dirty bit was set, the physical address it replaced in the renaming table will be entered at the write\_ptr location on the free list and the write\_ptr will be incremented. The de-allocation counter for the previous context (eight) will be incremented. Then, as set states come in for this context, one of the following will happen:

- 1.) No dirty bits are set for the logical address being updated. A line will be allocated off the free-list counter, or off the free list at read\_ptr if read\_ptr != stop\_ptr.
- 2.) Reset dirty set and Context dirty not set. A new physical address is allocated, the physical address in the renaming table is put on the free list at write\_ptr, and write\_ptr is incremented along with the de-allocate counter for the last context.
- 3.) Context dirty is set. The data will be written into the physical address specified by the logical address.

This process will continue as long as set states arrive. This block will provide backpressure to the CP whenever it has no free list entries available (counter at max and stop\_ptr == read\_ptr). The command stream will keep a count of contexts of constants in use and prevent more than max constants contexts from being sent.

Whenever a draw packet arrives, the content of the re-mapping table is written to the correct re-mapping table for the context number. Also, if the next context uses fewer constants than the current one, all exceeding lines are moved to the free list to be de-allocated later. This happens in parallel with the writing of the re-mapping table to the correct memory.

Preferably, when the constant context leaves the last ALU clause, it will be sent to this block and compared with the previous context that left (initialized to zero). If they differ, then the older context will no longer be referenced and thus can be de-allocated in the physical memory. This is accomplished by adding the number of blocks freed by this context to the stop\_ptr pointer. This will make all the physical addresses used by this context available to the read\_ptr allocate pointer for future allocation.

This device allows representation of multiple contexts of constants data with N copies of the logical address space. It also allows the second context to be represented as the first set plus some new additional data by just storing the deltas. It allows memory to be used efficiently, and when the constant updates are small it can store multiple contexts. However, if the updates are large, fewer contexts will be stored and performance will potentially be degraded, although it will still perform as well as a ring could in this case.

## 5.4 Constant Store Indexing

In order to do constant store indexing, the sequencer must first be loaded with the indexes (which come from the GPRs). There are 144 wires from the exit of the SP to the sequencer (9-bit pointers x 16 vertices/clock). Since the data must pass through the Shader pipe for the float to fixed conversion, there is a latency of 4 clocks (1 instruction) between the time the sequencer is loaded and the time one can index into the constant store. The assembly will look like this:

    MOVA R1.X,R2.X      // Loads the sequencer with the content of R2.X, also copies the content of R2.X into R1.X
    NOP                 // latency of the float to fixed conversion
    ADD R3,R4,C0[R2.X]  // Uses the state from the sequencer to add R4 to C0[R2.X] into R3

Note that we don't really care about what is in the brackets because we use the state from the MOVA instruction. R2.X is just written again for the sake of simplicity and coherency.

The storage needed in the sequencer in order to support this feature is 2\*64\*9 bits = 1152 bits.

## 5.5 Real Time Commands

The real time command constants are written by the CP using the register mapped registers allocated for RT. This works the same way as regular constant loads, BUT in this case the CP is not sending a logical address but rather a physical address, and the reads do not pass through the re-mapping table but are read directly from the memory. The boundary between the two zones is defined by the CONST\_EO\_RT control register. Similarly, for the fetch state, the boundary between the two zones is defined by the TSTATE\_EO\_RT control register.

## 5.6 Constant Waterfalling

In order to have reasonable performance in the case of constant store indexing using the address register, we are going to have the possibility of using the physical memory port for read only. This way we can read 1 constant per clock and thus have a worst-case waterfall mode of 1 vertex per clock. There is a small synchronization issue related to this, as the SQ needs to make sure that the constants were actually written to memory (not only sent to the sequencer) before it can allow the first vector of pixels or vertices of the state to go through the ALUs. To do so, the sequencer keeps 8 bits (one per render state), sets the bit whenever the last render state is written to memory, and clears the bit whenever a state is freed.



Figure 9: The Constant store




# 6. Looping and Branches

Loops and branches are planned to be supported and will have to be dealt with at the sequencer level. We plan on supporting constant loops and branches using a control program.

## 6.1 The controlling state.

The R400 controlling state consists of:

    Boolean[255:0]
    Loop_count[7:0][31:0]
    Loop_Start[7:0][31:0]
    Loop_Step[7:0][31:0]

That is 256 Booleans and 32 loops.

We have a stack of 4 elements for nested calls of subroutines and 4 loop counters to allow for nested loops.

This state is available on a per shader program basis.

## 6.2 The Control Flow Program

We'd like to be able to code up a program of the form:

    1: Loop
    2: Exec TexFetch
    3:      TexFetch
    4:      ALU
    5:      ALU
    6:      TexFetch
    7: End Loop
    8: ALU Export

But realize that 3: may be dependent on 2: and 4: is almost certainly dependent on 2: and 3:. Without clausing, these dependencies need to be expressed in the Control Flow instructions. Additionally, without separate 'texture clauses' and 'ALU clauses' we need to know which instructions to dispatch to the Texture Unit and which to the ALU unit. This information will be encapsulated in the flow control instructions.

Each control flow instruction will contain 2 bits of information for each (non-control flow) instruction:

- a) ALU or Texture
- b) Serialize Execution

(b) would force the thread to stop execution at this point (before the instruction is executed) and wait until all outstanding texture fetches have completed.

Given the allocation of reserved bits, this would mean that the count of an 'Exec' instruction would be limited to about 8 (non-control-flow) instructions. If more than this were needed, a second Exec (with the same conditions) would be issued.

Another function that relies upon 'clauses' is allocation and order of execution. We need to ensure that pixels and vertices are exported in the correct order (even if not all execution is ordered) and that space in the output buffers is allocated in order. Additionally, data can't be exported until space is allocated. A new control flow instruction:

Alloc <buffer select (position, parameter, pixel or vertex memory) and the size required>

would be created to mark where such allocation needs to be done. To ensure allocation is done in order, the actual allocation for a given thread cannot be performed unless the equivalent allocation for all previous threads is already completed. The implementation would also ensure that execution of the instruction(s) following the serialization due to the Alloc will occur in order -- at least until the next serialization or change from ALU to Texture. In most cases this will allow the exports to occur without any further synchronization. Only 'final' allocations or position allocations are guaranteed to be ordered. Because strict ordering is required for pixels, parameters and positions, this implies only a single alloc for these structures. Vertex exports to memory do not require ordering during allocation and so multiple 'allocs' may be done.

### 6.2.1 Control flow instructions table

Here is the revised control flow instruction set.

Note that whenever a field is marked as RESERVED, it is assumed that all the bits of the field are cleared (0).

| NOP   |            |          |
|-------|------------|----------|
| 47:44 | 43         | 42:0     |
| 0000  | Addressing | RESERVED |

This is a regular NOP.

| Execute |            |          |                                                |       |              |
|---------|------------|----------|------------------------------------------------|-------|--------------|
| 47:44   | 43         | 42:34    | 33:16                                          | 15:12 | 11:0         |
| 0001    | Addressing | RESERVED | Instructions type + serialize (9 instructions) | Count | Exec Address |

| Execute_End |            |          |                                                |       |              |
|-------------|------------|----------|------------------------------------------------|-------|--------------|
| 47:44       | 43         | 42:34    | 33:16                                          | 15:12 | 11:0         |
| 0010        | Addressing | RESERVED | Instructions type + serialize (9 instructions) | Count | Exec Address |

Execute up to 9 instructions at the specified address in the instruction memory. The Instruction type field tells the sequencer the type of each instruction (LSB: 1 = Texture, 0 = ALU) and whether or not to serialize its execution (MSB: 1 = Serialize, 0 = Non-Serialized). If the instruction is Execute\_End, this is the last execution block of the shader program.
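To make the field layout concrete, here is a hedged sketch of how a 48-bit Execute word could be unpacked in software. The function name and the returned dictionary are hypothetical. The sketch also assumes that instruction i occupies bits 2i and 2i+1 of the 18-bit type/serialize field; the text only states that the LSB of each pair is the type and the MSB is the serialize flag.

```python
def decode_execute(word: int) -> dict:
    """Split a 48-bit Execute / Execute_End control flow word into its fields."""
    opcode = (word >> 44) & 0xF            # bits 47:44
    addressing = (word >> 43) & 0x1        # bit 43
    type_serialize = (word >> 16) & 0x3FFFF  # bits 33:16, 2 bits x 9 instructions
    count = (word >> 12) & 0xF             # bits 15:12
    exec_address = word & 0xFFF            # bits 11:0

    instructions = []
    for i in range(min(count, 9)):
        pair = (type_serialize >> (2 * i)) & 0x3   # assumed packing order
        unit = "TEX" if pair & 0x1 else "ALU"      # LSB: 1 = Texture, 0 = ALU
        serialize = bool(pair & 0x2)               # MSB: 1 = Serialize
        instructions.append((unit, serialize))

    return {
        "opcode": opcode,
        "addressing": addressing,
        "count": count,
        "exec_address": exec_address,
        "instructions": instructions,
    }


# Example: Execute (0001), 2 instructions at address 0x123:
# a non-serialized texture fetch followed by a serialized ALU instruction.
example = decode_execute((0b0001 << 44) | (0b1001 << 16) | (2 << 12) | 0x123)
```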

| Conditional_Execute |            |           |                 |                                                |       |              |
|---------------------|------------|-----------|-----------------|------------------------------------------------|-------|--------------|
| 47:44               | 43         | 42        | 41:34           | 33:16                                          | 15:12 | 11:0         |
| 0011                | Addressing | Condition | Boolean address | Instructions type + serialize (9 instructions) | Count | Exec Address |

| Conditional_Execute_End |            |           |                 |                                                |       |              |
|-------------------------|------------|-----------|-----------------|------------------------------------------------|-------|--------------|
| 47:44                   | 43         | 42        | 41:34           | 33:16                                          | 15:12 | 11:0         |
| 0100                    | Addressing | Condition | Boolean address | Instructions type + serialize (9 instructions) | Count | Exec Address |

If the specified Boolean (8 bits can address 256 Booleans) meets the specified condition, then execute the specified instructions (up to 9 instructions). If the condition is not met, we go on to the next control flow instruction. If the instruction is Conditional\_Execute\_End and the condition is met, this is the last execution block of the shader program.

| Conditional_Execute_Predicates |            |           |          |                  |                                                |       |              |
|--------------------------------|------------|-----------|----------|------------------|------------------------------------------------|-------|--------------|
| 47:44                          | 43         | 42        | 41:36    | 35:34            | 33:16                                          | 15:12 | 11:0         |
| 0101                           | Addressing | Condition | RESERVED | Predicate vector | Instructions type + serialize (9 instructions) | Count | Exec Address |

| Conditional_Execute_Predicates_End |            |           |          |                  |                                                |       |              |
|------------------------------------|------------|-----------|----------|------------------|------------------------------------------------|-------|--------------|
| 47:44                              | 43         | 42        | 41:36    | 35:34            | 33:16                                          | 15:12 | 11:0         |
| 0110                               | Addressing | Condition | RESERVED | Predicate vector | Instructions type + serialize (9 instructions) | Count | Exec Address |

Check the AND/OR of all current predicate bits. If the AND/OR matches the condition, execute the specified number of instructions. We need to AND/OR this with the kill mask in order not to consider the pixels that aren't valid. If the condition is not met, we go on to the next control flow instruction. If the instruction is Conditional\_Execute\_Predicates\_End and the condition is met, this is the last execution block of the shader program.

| Conditional_Execute_Predicates_No_Stall |            |           |          |                  |                                                |       |              |
|-----------------------------------------|------------|-----------|----------|------------------|------------------------------------------------|-------|--------------|
| 47:44                                   | 43         | 42        | 41:36    | 35:34            | 33:16                                          | 15:12 | 11:0         |
| 1101                                    | Addressing | Condition | RESERVED | Predicate vector | Instructions type + serialize (9 instructions) | Count | Exec Address |

| Conditional_Execute_Predicates_No_Stall_End |            |           |          |                  |                                                |       |              |
|---------------------------------------------|------------|-----------|----------|------------------|------------------------------------------------|-------|--------------|
| 47:44                                       | 43         | 42        | 41:36    | 35:34            | 33:16                                          | 15:12 | 11:0         |
| 1110                                        | Addressing | Condition | RESERVED | Predicate vector | Instructions type + serialize (9 instructions) | Count | Exec Address |

Same as Conditional\_Execute\_Predicates, but the SQ is not going to wait for the predicate vector to be updated. You can only set this in the compiler if you know that the predicate set is only a refinement of the current one (like a nested if), because the optimization would still work.

| Loop_Start |            |          |         |          |              |
|------------|------------|----------|---------|----------|--------------|
| 47:44      | 43         | 42:21    | 20:16   | 15:12    | 11:0         |
| 0111       | Addressing | RESERVED | loop ID | RESERVED | Jump address |

Loop Start. Compares the loop iterator with the end value. If the loop condition is not met, jump to the address. Forward jump only. Also computes the index value. The loop ID must match between the start and the end, and also indicates which control flow constants should be used with the loop.

| Loop_End |            |          |                 |         |          |               |
|----------|------------|----------|-----------------|---------|----------|---------------|
| 47:44    | 43         | 42:24    | 23:21           | 20:16   | 15:12    | 11:0          |
| 1000     | Addressing | RESERVED | Predicate break | loop ID | RESERVED | start address |

Loop end. Increments the counter by one, compares the loop count with the end value. If loop condition met, continue, else, jump BACK to the start of the loop. If predicate break != 0, then compares predicate vector n (specified by predicate break number). If all bits cleared then break the loop.

The way this is described does not prevent nested loops, and the inclusion of the loop ID makes this easy to do.

| Conditional_Call |            |           |                 |          |            |              |
|------------------|------------|-----------|-----------------|----------|------------|--------------|
| 47:44            | 43         | 42        | 41:34           | 33:13    | 12         | 11:0         |
| 1001             | Addressing | Condition | Boolean address | RESERVED | Force Call | Jump address |

If the condition is met, jumps to the specified address and pushes the control flow program counter on the stack. If force call is set the condition is ignored and the call is made always.

| Return |            |          |
|--------|------------|----------|
| 47:44  | 43         | 42:0     |
| 1010   | Addressing | RESERVED |

Pops the topmost address from the stack and jumps to that address. If nothing is on the stack, the program will just continue to the next instruction.
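The call and return behavior can be sketched with a 4-entry stack, matching the stack depth given in section 6.1. The class and method names are hypothetical. Two points are assumptions, since the text does not specify them: the value pushed is the address of the instruction after the call, and a call on a full stack raises an error.

```python
STACK_DEPTH = 4  # "a stack of 4 elements for nested calls of subroutines"


class ControlFlowStack:
    """Toy model of Conditional_Call / Return (names are hypothetical)."""

    def __init__(self):
        self.stack = []

    def call(self, pc, target, condition_met, force_call=False):
        """Return the next control flow address after a Conditional_Call at pc."""
        if force_call or condition_met:
            if len(self.stack) >= STACK_DEPTH:
                raise OverflowError("call stack full")  # assumption: not specified
            self.stack.append(pc + 1)  # assumption: resume after the call
            return target
        return pc + 1

    def ret(self, pc):
        """Return the next control flow address after a Return at pc."""
        if self.stack:
            return self.stack.pop()
        return pc + 1  # empty stack: continue with the next instruction
```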

| Conditional_Jump |            |           |                 |         |          |            |              |
|------------------|------------|-----------|-----------------|---------|----------|------------|--------------|
| 47:44            | 43         | 42        | 41:34           | 33      | 32:13    | 12         | 11:0         |
| 1011             | Addressing | Condition | Boolean address | FW only | RESERVED | Force Jump | Jump address |

If force jump is set the condition is ignored and the jump is made always. If FW only is set then only forward jumps are allowed.


| Allocate |       |               |          |                 |
|----------|-------|---------------|----------|-----------------|
| 47:44    | 43    | 42:41         | 40:4     | 3:0             |
| 1100     | Debug | Buffer Select | RESERVED | Allocation size |

Buffer Select takes a value of the following:

01 - position export (ordered export)

10 - parameter cache or pixel export (ordered export)

11 - pass thru (out of order exports).

Buffer Size takes a value of the following:

00 - 1 buffer

01 - 2 buffers

. . .

15 - 16 buffers

If debug is set this is a debug alloc (ignore if debug DB\_ON register is set to off).

## 6.3 Implementation

The envisioned implementation has a buffer that maintains the state of each thread. A thread lives in a given location in the buffer during its entire life, but the buffer has FIFO qualities in that threads leave in the order that they enter. Actually two buffers are maintained — one for Vertices and one for Pixels. The intended implementation would allow for:

16 entries for vertices

48 entries for pixels.

From each buffer, arbitration logic attempts to select 1 thread for the texture unit and 1 (interleaved) thread for the ALU unit. Once a thread is selected it is read out of the buffer, marked as invalid, and submitted to the appropriate execution unit. It is returned to the buffer (at the same place) with its status updated once all possible sequential instructions have been executed. A switch from ALU to TEX or vice versa, or a Serialize\_Execution modifier, forces the thread to be returned to the buffer.

Each entry in the buffer will be stored across two physical pieces of memory - most bits will be stored in a 1 read port device. Only bits needed for thread arbitration will be stored in a highly multi-ported structure. The bits kept in the 1 read port device will be termed 'state'. The bits kept in the multi-read ported device will be termed 'status'.

#### 'State Bits' needed include:

- 1. Control Flow Instruction Pointer (13 bits),
- 2. Execution Count Marker (4 bits),
- 3. Loop Iterators (4x9 bits),
- 4. Call return pointers (4x12 bits),
- 5. Predicate Bits (64 bits),
- 6. Export ID (1 bit),
- 7. Parameter Cache base Ptr (7 bits),
- 8. GPR Base Ptr (8 bits),
- 9. Context Ptr (3 bits).
- 10. LOD corrections (6x16 bits)
- 11. Valid bits (64 bits)

Absent from this list are 'Index' pointers. These are costly enough that I'm presuming that they are instead stored in the GPRs. The first seven fields above (Control Flow Ptr, Execution Count, Loop Counts, call return ptrs, Predicate bits, PC base ptr and export ID) are updated every time the thread is returned to the buffer, based on how much progress has been made on thread execution. GPR Base Ptr, Context Ptr and LOD corrections are unchanged throughout execution of the thread.


#### 'Status Bits' needed include:

- Valid Thread
- Texture/ALU engine needed
- Texture Reads are outstanding
- Waiting on Texture Read to Complete
- Allocation Wait (2 bits)
  - 00 - No allocation needed
  - 01 - Position export allocation needed (ordered export)
  - 10 - Parameter or pixel export needed (ordered export)
  - 11 - pass thru (out of order export)
- Allocation Size (4 bits)
- Position Allocated
- First thread of a new context
- Event thread (NULL thread that needs to trickle down the pipe)
- Last (1 bit)
- Pulse SX (1 bit)

All of the above fields from all of the entries go into the arbitration circuitry. The arbitration circuitry will select a winner for both the Texture Engine and for the ALU engine. There are actually two sets of arbitration -- one for pixels and one for vertices. A final selection is then done between the two. But the rest of this implementation summary only considers the 'first' level selection which is similar for both pixels and vertices.

Texture arbitration requires no allocation or ordering so it is purely based on selecting the 'oldest' thread that requires the Texture Engine.

ALU arbitration is a little more complicated. First, only threads where either of Texture\_Reads\_outstanding or Waiting\_on\_Texture\_Read\_to\_Complete are '0' are considered. Then if Allocation\_Wait is active, these threads are further filtered based on whether space is available. If the allocation is position allocation, then the thread is only considered if all 'older' threads have already done their position allocation (position allocated bits set). If the allocation is parameter or pixel allocation, then the thread is only considered if it is the oldest thread. Also a thread is not considered if it is a parameter or pixel or position allocation, has its First\_thread\_of\_a\_new\_context bit set and would cause ALU interleaving with another thread performing the same parameter or pixel or position allocation. Finally the 'oldest' of the threads that pass through the above filters is selected. If the thread needed to allocate, then at this time the allocation is done, based on Allocation\_Size. If a thread has its "last" bit set, then it is also removed from the buffer, never to return.
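The filtering steps of ALU arbitration can be sketched as follows. This is a simplified model under assumptions, not the arbitration circuit: the data class, function name and the `free_buffers` dictionary (free buffer count per allocation type) are hypothetical; the texture filter follows a literal reading of the sentence above (a thread is skipped only when both texture bits are set); Allocation\_Size is taken to encode the buffer count minus one, as in the Allocate instruction; and the First\_thread\_of\_a\_new\_context interleaving rule is left out.

```python
from dataclasses import dataclass

NO_ALLOC, POSITION, PARAM_OR_PIXEL, PASS_THRU = 0, 1, 2, 3


@dataclass
class ThreadStatus:
    valid: bool = True
    needs_alu: bool = True
    reads_outstanding: bool = False
    waiting_on_read: bool = False
    alloc_wait: int = NO_ALLOC
    alloc_size: int = 0            # assumed encoding: buffers needed minus one
    position_allocated: bool = False


def select_alu_thread(threads, free_buffers):
    """threads is ordered oldest first; returns the winning index or None."""
    older = []  # valid threads that are older than the one being examined
    for index, t in enumerate(threads):
        if not t.valid:
            continue
        ok = t.needs_alu and not (t.reads_outstanding and t.waiting_on_read)
        if ok and t.alloc_wait != NO_ALLOC:
            # Space must be available for the requested allocation.
            ok = free_buffers.get(t.alloc_wait, 0) >= t.alloc_size + 1
        if ok and t.alloc_wait == POSITION:
            # All older threads must already have their position allocation.
            ok = all(o.position_allocated for o in older)
        if ok and t.alloc_wait == PARAM_OR_PIXEL:
            # Only the oldest thread may do a parameter or pixel allocation.
            ok = not older
        if ok:
            return index  # first hit in oldest-first order is the oldest
        older.append(t)
    return None
```

Because the list is walked oldest first, the first thread that passes every filter is also the oldest eligible one, which is the selection rule stated in the text.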

If I now redefine 'clauses' to mean 'how many times the thread is removed from the thread buffer for the purpose of execution by either the ALU or Texture engine', then the minimum number of clauses needed is 2 -- one to perform the allocation for exports (execution automatically halts after an 'Alloc' instruction, which doesn't perform the actual allocation) and one for the actual ALU/export instructions. As the 'Alloc' instruction could be part of a texture clause (presumably the final instruction in such a clause), a thread could still execute in this minimal number of 2 clauses, even if it involved texture fetching.

The Texture\_Reads\_Outstanding bit must be updated by the sequencer, based on keeping track of how many Texture Clauses have been executed by a given thread that have not yet had their data returned. Any number above 0 results in this bit being set. We could consider forcing synchronization such that two texture clauses for a given thread may not be outstanding at any time (that would be my preference for simplicity reasons and because it would require only very little change in the texture pipe interface). This would allow the sequencer to set the bit on execution of the texture clause, and allow the texture unit to return a pointer to the thread buffer on completion that clears the bit.

## 6.4 Data dependent predicate instructions

Data dependent conditionals will be supported in the R400. The only way we plan to support those is by supporting four vector/scalar predicate operations of the form:


PRED\_SETE\_# - similar to SETE except that the result is 'exported' to the sequencer.

PRED\_SETNE\_# - similar to SETNE except that the result is 'exported' to the sequencer.

PRED\_SETGT\_# - similar to SETGT except that the result is 'exported' to the sequencer

PRED\_SETGTE\_# - similar to SETGTE except that the result is 'exported' to the sequencer

For the scalar operations only we will also support the two following instructions:

PRED\_SETE0\_# - SETE0

PRED\_SETE1\_# - SETE1

The export is a single bit (1 or 0) that is sent using the same data path as the MOVA instruction. The sequencer will maintain 4 sets of 64 bit predicate vectors (in fact 8 sets because we interleave two programs, but only 4 will be exposed) and use them to control the write masking. This predicate is not maintained across clause boundaries. The # sign is used to specify which predicate set you want to use, 0 through 3.

Then we have two conditional execute bits. The first bit is a conditional execute "on" bit and the second bit tells us if we execute on 1 or 0. For example, the instruction:

P0\_ADD\_# R0,R1,R2

Is only going to write the result of the ADD into those GPRs whose predicate bit is 0. Alternatively, P1\_ADD\_# would only write the results to the GPRs whose predicate bit is set. The use of the P0 or P1 without precharging the sequencer with a PRED instruction is undefined.
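The predicated write masking described above can be sketched as follows. This is only an illustrative model, not the hardware implementation: the function names `pred_sete` and `predicated_write` are hypothetical, lanes are modelled as Python list entries, and 4 lanes are used instead of 64 for brevity.

```python
def pred_sete(a, b):
    """Per-lane compare in the style of PRED_SETE: 1 where a == b, else 0."""
    return [1 if x == y else 0 for x, y in zip(a, b)]


def predicated_write(gprs, results, predicate, mode):
    """Return the per-lane GPR contents after a (possibly) predicated write.

    mode is None for an unconditional write, "P0" to write only lanes whose
    predicate bit is 0, or "P1" to write only lanes whose predicate bit is 1.
    """
    if mode not in (None, "P0", "P1"):
        raise ValueError("mode must be None, 'P0' or 'P1'")
    out = list(gprs)
    for lane, (value, bit) in enumerate(zip(results, predicate)):
        # Write the lane when unconditional, or when the predicate bit
        # matches the polarity selected by the conditional execute bits.
        if mode is None or (mode == "P1") == bool(bit):
            out[lane] = value
    return out


pred = pred_sete([1, 2, 3, 4], [1, 0, 3, 0])
written_on_one = predicated_write([0, 0, 0, 0], [9, 9, 9, 9], pred, "P1")
written_on_zero = predicated_write([0, 0, 0, 0], [9, 9, 9, 9], pred, "P0")
```

Here `written_on_one` only updates the lanes where the compare succeeded, and `written_on_zero` updates the complementary lanes.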

{Issue: do we have to have a NOP between PRED and the first instruction that uses a predicate?}

#### 6.5 HW Detection of PV,PS

Because of the control program, the compiler cannot statically detect dependent instructions. In the case of non-masked writes and subsequent reads, the sequencer will insert uses of PV,PS as needed. This will be done by comparing the read address and the write address of consecutive instructions. For masked writes, the sequencer will insert NOPs wherever there is a dependent read/write.

The sequencer will also have to insert NOPs between PRED\_SET and MOVA instructions and their uses.

### 6.6 Register file indexing

Because we can have loops in a fetch clause, we need to be able to index into the register file in order to retrieve the data created in a fetch clause loop and use it in an ALU clause. The instruction will include the base address for register indexing and the instruction will contain these controls:

| Bit7 | Bit 6 |                     |
|------|-------|---------------------|
| 0    | 0     | 'absolute register' |
| 0    | 1     | 'relative register' |
| 1    | 0     | 'previous vector'   |
| 1    | 1     | 'previous scalar'   |

In the case of an absolute register we just take the address as is. In the case of a relative register read we take the base address and we add to it the loop\_index and this becomes our new address that we give to the shader pipe.

The sequencer is going to keep a loop index computed as such:

Index = Loop\_iterator\*Loop\_step + Loop\_start.

We loop until loop\_iterator = loop\_count. Loop\_step is a signed value [-128...127]. The computed index value is a 10 bit counter that is also signed. Its real range is [-256,256]. The tenth bit is only there so that we can provide an out of range value to the "indexing logic" so that it knows when the provided index is out of range and thus can make the necessary arrangements.
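As a minimal sketch of the index computation above: the helper names below are hypothetical, the iterator is assumed to start at 0 and stop before reaching loop\_count, and the range check simply uses the [-256,256] range quoted in the text.

```python
def loop_indices(loop_start, loop_step, loop_count):
    """Yield Index = Loop_iterator * Loop_step + Loop_start per iteration."""
    if not -128 <= loop_step <= 127:
        raise ValueError("Loop_step is a signed value in [-128, 127]")
    for loop_iterator in range(loop_count):
        yield loop_iterator * loop_step + loop_start


def index_in_range(index):
    """True when the index lies inside the range quoted by the spec."""
    return -256 <= index <= 256


def relative_register_address(base_address, index):
    """Relative register read: base address plus the current loop index."""
    return base_address + index


forward = list(loop_indices(loop_start=2, loop_step=3, loop_count=4))
backward = list(loop_indices(loop_start=10, loop_step=-2, loop_count=3))
```

With a base address of 16, the forward loop above would read registers 18, 21, 24 and 27 in turn.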


## 6.7 Debugging the Shaders

In order to be able to debug the pixel/vertex shaders efficiently, we provide 2 methods.

#### 6.7.1 Method 1: Debugging registers

Current plans are to expose 2 debugging, or error notification, registers:

- 1. address register where the first error occurred
- 2. count of the number of errors

The sequencer will detect the following groups of errors:

- count overflow
- constant indexing overflow
- register indexing overflow

Compiler recognizable errors:

- jump errors
  - relative jump address > size of the control flow program
- call stack
  - call with stack full
  - return with stack empty

A jump error will always cause the program to break. In this case, a break means that a clause will halt execution, but further clauses are still allowed to execute.

With all the other errors, the program can continue to run, potentially to worst-case limits. The program will only break if the DB PROB BREAK register is set.

If an index falls outside the constant or register range, causing an overflow error, the hardware is specified to return the value at index 0. This could be exploited to generate error tokens, by reserving and initializing the 0th register (or constant) for errors.
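The index-0 fallback and the two debugging registers can be modelled roughly as below. The class and function names are hypothetical, and it is an assumption that the "address register" records the address of the instruction that caused the first error.

```python
class DebugErrorRegisters:
    """Model of the two error notification registers (names are assumed)."""

    def __init__(self):
        self.first_error_address = None  # address where the first error occurred
        self.error_count = 0             # count of the number of errors

    def record(self, instruction_address):
        if self.error_count == 0:
            self.first_error_address = instruction_address
        self.error_count += 1


def read_indexed(store, index, instruction_address, regs):
    """Read store[index]; on an out-of-range index, log it and return entry 0."""
    if 0 <= index < len(store):
        return store[index]
    regs.record(instruction_address)
    return store[0]


regs = DebugErrorRegisters()
constants = ["ERROR_TOKEN", 1.0, 2.0]  # entry 0 reserved for error tokens
in_range = read_indexed(constants, 2, 0x10, regs)
too_high = read_indexed(constants, 7, 0x14, regs)
too_low = read_indexed(constants, -1, 0x18, regs)
```

Reserving entry 0 for a recognizable token, as suggested above, makes out-of-range reads visible in the shader output.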

{ISSUE : Interrupt to the driver or not?}

#### 6.7.2 Method 2: Exporting the values in the GPRs

1) The sequencer will have a debug active, count register and an address register for this mode.

Under the normal mode execution follows the normal course.

Under the debug mode it is assumed that the program is always exporting n debug vectors and that all other exports to the SX block (position, color, z, etc.) will be turned off (changed into NOPs) by the sequencer (even if they occur before the address stated by the ADDR debug register).

## 7. Pixel Kill Mask

A vector of 64 bits is kept by the sequencer per group of pixels/vertices. Its purpose is to optimize the texture fetch requests and allow the shader pipe to kill pixels using the following instructions:

MASK\_SETE

MASK\_SETNE

MASK\_SETGT

MASK\_SETGTE

### 8. Multipass vertex shaders (HOS)

Multipass vertex shaders are able to export from the 6 last clauses but to memory ONLY.


## 9. Register file allocation

The register file allocation for vertices and pixels can either be static or dynamic. In both cases, the register file is managed using two round robins (one for pixels and one for vertices). In the dynamic case the boundary between pixels and vertices is allowed to move; in the static case it is fixed to 128-VERTEX\_REG\_SIZE for vertices and PIXEL\_REG\_SIZE for pixels.


Above is an example of how the algorithm works. Vertices come in from top to bottom; pixels come in from bottom to top. Vertices are in orange and pixels in green. The blue line is the tail of the vertices and the green line is the tail of the pixels. Thus anything between the two lines is shared. When pixels meet vertices the line turns white and the boundary is static until both vertices and pixels share the same "unallocated bubble". Then the boundary is allowed to move again. The numbering of the GPRs starts from the bottom of the picture at index 0 and goes up to the top at index 127.

### 10. Fetch Arbitration

The fetch arbitration logic chooses one of the n potentially pending fetch clauses to be executed. The choice is made by looking at the Vs and Ps reservation stations and picking the first one ready to execute. Once chosen, the clause state machine will send one 2x2 fetch per clock (or 4 fetches in one clock every 4 clocks) until all the fetch instructions of the clause are sent. This means that there cannot be any dependencies between two fetches of the same clause.

The arbitrator will not wait for the fetches to return prior to selecting another clause for execution. The fetch pipe will be able to handle up to X(?) in flight fetches and thus there can be a fair number of active clauses waiting for their fetch return data.

## 11. ALU Arbitration

ALU arbitration proceeds in almost the same way as fetch arbitration. The ALU arbitration logic chooses one of the n potentially pending ALU clauses to be executed. The choice is made by looking at the Vs and Ps reservation stations and picking the first one ready to execute. There are two ALU arbiters, one for the even clocks and one for the odd clocks. For example, here is the sequencing of two interleaved ALU clauses (E and O stand for Even and Odd sets of 4 clocks):

Einst0 Oinst0 Einst1 Oinst1 Einst2 Oinst2 Einst0 Oinst3 Einst1 Oinst4 Einst2 Oinst0...




Proceeding this way hides the latency of 8 clocks of the ALUs. Also note that the interleaving also occurs across clause boundaries.
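The example sequence above can be reproduced with a small sketch. The function names are hypothetical, and the clause lengths (3 instructions per clause on the even arbiter, 5 on the odd arbiter) are inferred from the example sequence rather than stated by the spec.

```python
from itertools import islice


def instruction_stream(prefix, clause_lengths):
    """Yield instruction labels for back-to-back clauses on one arbiter."""
    for length in clause_lengths:
        for i in range(length):
            yield f"{prefix}inst{i}"


def interleave(even, odd):
    """Alternate even and odd slots, including across clause boundaries."""
    for e, o in zip(even, odd):
        yield e
        yield o


even = instruction_stream("E", [3, 3, 3, 3])  # assumed: 3-instruction clauses
odd = instruction_stream("O", [5, 5])         # assumed: 5-instruction clauses
sequence = list(islice(interleave(even, odd), 12))
```

Each arbiter restarts at inst0 when its own clause ends, independently of the other, which is why the interleaving continues across clause boundaries.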

## 12. Handling Stalls

When the output file is full, the sequencer prevents the ALU arbitration logic from selecting the last clause (this way nothing can exit the shader pipe until there is space in the output file). If the packet is a vertex packet and the position buffer is full (POS\_FULL), then the sequencer also prevents a thread from entering an exporting clause. The sequencer will set the OUT\_FILE\_FULL signal n clocks before the output file is actually full, and thus the ALU arbiter will be able to read this signal and act accordingly by not allowing exporting clauses to proceed.

#### 13. Content of the reservation station FIFOs

The reservation FIFOs contain the state of the vector of pixels and vertices. We have two sets of those: one for pixels, and one for vertices. They contain 3 bits of Render State, 7 bits for the base address of the GPRs, some bits for LOD correction, coverage mask information (in order to fetch for only valid pixels), and the quad address.

### 14. The Output File

The output file is where pixels are put before they go to the RBs. The write BW to this store is 256 bits/clock. Just before this output file are staging registers with write BW 512 bits/clock and read BW 256 bits/clock. The staging registers are 4x128 (and there are 16 of those on the whole chip).

#### 15. IJ Format

The IJ information sent by the PA is of this format on a per quad basis:

We have a vector of IJ's (one IJ per pixel at the centroid of the fragment or at the center of the pixel depending on the mode bit). All pixel parameters are always interpolated at full 20x24 mantissa precision.

$$P0 = A + I(0) * (B - A) + J(0) * (C - A)$$

$$P1 = A + I(1) * (B - A) + J(1) * (C - A)$$

$$P2 = A + I(2) * (B - A) + J(2) * (C - A)$$

$$P3 = A + I(3) * (B - A) + J(3) * (C - A)$$



Multiplies (Full Precision): 8

Subtracts 19x24 (Parameters): 2

Adds: 8
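The four equations and the operation counts above can be checked with a short sketch. The function name is hypothetical and ordinary Python floats stand in for the hardware number formats; the point is only that the two subtracts are shared by the quad while each pixel costs two multiplies and two adds.

```python
def interpolate_quad(a, b, c, ij):
    """Interpolate one parameter for a quad.

    a, b, c are the parameter values at the three vertices and ij is a list
    of four (I, J) pairs, one per pixel.
    """
    delta_ba = b - a  # subtract 1, shared by all four pixels
    delta_ca = c - a  # subtract 2, shared by all four pixels
    # Per pixel: 2 multiplies and 2 adds, so 8 multiplies and 8 adds per quad.
    return [a + i * delta_ba + j * delta_ca for i, j in ij]


quad = interpolate_quad(1.0, 3.0, 5.0, [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (0.5, 0.5)])
```

With (I, J) = (0, 0), (1, 0) and (0, 1) the result is exactly A, B and C respectively.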

FORMAT OF P's IJ: Mantissa 20 Exp 4 for I + Sign

Mantissa 20 Exp 4 for J + Sign

Total number of bits: 20\*8 + 4\*8 + 4\*2 = 200.

All numbers are kept using the un-normalized floating point convention: if the exponent is different from 0, the number is normalized; if not, the number is un-normalized. The maximum range for the IJs (Full precision) is +/- 1024.

### 15.1 Interpolation of constant attributes

Because of the floating point imprecision, we need to take special provisions if all the interpolated terms are the same or if two of the terms are the same.




### 16. Staging Registers

In order for the reuse of the vertices to be 14, the sequencer will have to re-order the data sent IN ORDER by the VGT for it to be aligned with the parameter cache memory arrangement. Given the following group of vertices sent by the VGT:

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 || 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 || 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 || 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63

The sequencer will re-arrange them in this fashion:

0 1 2 3 16 17 18 19 32 33 34 35 48 49 50 51 || 4 5 6 7 20 21 22 23 36 37 38 39 52 53 54 55 || 8 9 10 11 24 25 26 27 40 41 42 43 56 57 58 59 || 12 13 14 15 28 29 30 31 44 45 46 47 60 61 62 63
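The re-ordering shown above can be written as a short sketch. The function name is hypothetical; the constants (64 vertices, 4 SPs, groups of 4 vertices) are read off the two listings.

```python
def reorder_for_shader_pipes(vertices):
    """Re-arrange 64 in-order vertices so each SP receives its 16 vertices."""
    if len(vertices) != 64:
        raise ValueError("expected a full vector of 64 vertices")
    reordered = []
    for sp in range(4):            # one 16-vertex slice per shader pipe
        for block in range(4):     # each incoming block of 16 vertices
            start = block * 16 + sp * 4
            reordered.extend(vertices[start:start + 4])
    return reordered


reordered = reorder_for_shader_pipes(list(range(64)))
sp0, sp1, sp3 = reordered[0:16], reordered[16:32], reordered[48:64]
```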

The || markers show the SP divisions. In the event a shader pipe is broken, the SQ is responsible for inserting padding to account for the missing pipe. For example, if SP1 is broken, vertices 4 5 6 7 20 21 22 23 36 37 38 39 52 53 54 55 will not be sent by the VGT to the SQ, and the SQ is responsible for "jumping" over these vertices so that no valid vertices are sent to an invalid SP.

The most straightforward, non-compressed interface method would be to convert, in the VGT, the data to 32-bit floating point prior to transmission to the VSISRs. In this scenario, the data would be transmitted to (and stored in) the VSISRs in full 32-bit floating point. This method requires three 24-bit fixed-to-float converters in the VGT. Unfortunately, it also requires an additional 3,072 bits of storage across the VSISRs. This interface is illustrated in Figure 11. The area of the fixed-to-float converters and the VSISRs for this method is roughly estimated as 0.759 sq mm using the R300 process. The gate count estimate is shown in Figure 10.



Figure 10: Area Estimate for VGT to Shader Interface




Figure 11: VGT to Shader Interface

### 17. The parameter cache

The parameter cache is where the vertex shaders export their data. It consists of 16 128x128 memories (1R/1W). The reuse engine will make it so that all vertices of a given primitive will hit different memories. The allocation method for these memories is a simple round robin. The parameter cache pointers are mapped in the following way: the 4 MSBs are the memory number and the 7 LSBs are the address within this memory.

| MEMORY NUMBER | ADDRESS |
|---------------|---------|
| 4 bits        | 7 bits  |

The PA generates the parameter cache addresses as the positions come from the SQ. All it needs to do is keep a Current\_Location pointer (7 bits only) and, as the positions come in, increment the memory number. When the memory number field wraps around, the PA increments the Current\_Location by VS\_EXPORT\_COUNT (a snooped register from the SQ). As an example, say the memories are all empty to begin with and the vertex shader is exporting 8 parameters per vertex (VS\_EXPORT\_COUNT = 8). The first position received is going to have the PC address 00000000000, the second one 00010000000, the third one 00100000000 and so on up to 11110000000. Then the next position received (the 17<sup>th</sup>) is going to have the address 00000001000, the 18<sup>th</sup> 00010001000, the 19<sup>th</sup> 00100001000 and so on. The Current\_Location is NEVER reset EXCEPT on chip resets. The only thing to be careful about is that if the SX doesn't send a full group of positions (<64), then the PA needs to fill the address space so that the next group starts correctly aligned (for example, if only 33 positions are received, then 2\*VS\_EXPORT\_COUNT must be added to Current\_Location and the memory count reset to 0 before the next vector begins).
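The address generation described above can be sketched as a small model. The class and method names are hypothetical, and the padding rule in `end_group` is an assumption generalized from the 33-position example: every group is treated as advancing Current\_Location by 4\*VS\_EXPORT\_COUNT in total.

```python
class ParamCacheAddressGen:
    """Model of the PA's parameter cache address generation (a sketch)."""

    def __init__(self, vs_export_count):
        self.vs_export_count = vs_export_count
        self.current_location = 0  # 7-bit address within a memory
        self.memory = 0            # 4-bit memory number
        self.in_group = 0          # positions seen in the current group of 64

    def next_address(self):
        """Return the 11-bit address for the next position received."""
        address = (self.memory << 7) | self.current_location
        self.memory = (self.memory + 1) % 16
        if self.memory == 0:  # memory number field wrapped around
            self.current_location = (
                self.current_location + self.vs_export_count) % 128
        self.in_group = (self.in_group + 1) % 64
        return address

    def end_group(self):
        """Pad after a partial group (<64) so the next group starts aligned."""
        if self.in_group:
            wraps_done = self.in_group // 16
            self.current_location = (
                self.current_location
                + (4 - wraps_done) * self.vs_export_count) % 128
            self.memory = 0
            self.in_group = 0


gen = ParamCacheAddressGen(vs_export_count=8)
addresses = [gen.next_address() for _ in range(19)]

partial = ParamCacheAddressGen(vs_export_count=8)
for _ in range(33):
    partial.next_address()
partial.end_group()  # adds 2 * VS_EXPORT_COUNT, as in the example above
```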




## 17.1 Export restrictions

#### 17.1.1 Pixel exports:

Pixels can export 1, 2, 3 or 4 color buffers to the SX (+z). The exports will be done in order. The PRED\_OPTIMIZE function has to be turned off if the exports are done using interleaved predicated instructions. The exports will always be ordered to the SX.

#### 17.1.2 Vertex exports:

Position or parameter caches can be exported in any order in the shader program. It is always better to export position as soon as possible. Position has to be exported in a single export block (no texture instructions can be placed between the exports). Parameter cache exports can be done in any order with texture instructions interleaved. The PRED\_OPTIMIZE function has to be turned off if the exports are done using interleaved predicated instructions to the Parameter cache (see Arbitration restrictions for details). The exports will always be allocated in order to the SX.

#### 17.1.3 Pass thru exports:

Pass thru exports have to be done in groups of the form:

```
Alloc 4 (8 or 12)
Execute ALU(ADDR) ALU(DATA) ALU(DATA) ALU(DATA)...
```

They cannot have texture instructions interleaved in the export block. These exports are not guaranteed to be ordered.

Also, when doing a pass thru export, Position MUST be exported AFTER all pass thru exports. This position export is used to synchronize the chip when doing a transition from pass thru shader to regular shader and vice versa.

#### 17.2 Arbitration restrictions

Here are the Sequencer arbitration restrictions:

- 1) Cannot execute a serialized thread if the corresponding texture pending bit is set
- 2) Cannot allocate position if any older thread has not allocated position
- 3) If last thread is marked as not valid AND marked as last and we are about to execute the second to oldest thread also marked last then:
  - a. Both threads must be from the same context (cannot allow a first thread)
  - b. Must turn off the predicate optimization for the second thread
- 4) Cannot execute a texture clause if texture reads are pending
- 5) Cannot execute last if texture pending (even if not serial)

### 18. Export Types

The export type (or the location where the data should be put) is specified using the destination address field in the ALU instruction. Here is a list of all possible export modes:

## 18.1 Vertex Shading

0:15 - 16 parameter cache

16:31 - Empty (Reserved?)

32 - Export Address

33:40 - 8 vertex exports to the frame buffer and index

41:47 - Empty

48:55 - 8 debug export (interpret as normal vertex export)

export addressing mode

61 - Empty

62 - position


- sprite size export that goes with position export (point\_h,point\_w,edgeflag,misc)

### 18.2 Pixel Shading

0 - Color for buffer 0 (primary)

1 - Color for buffer 1

2 - Color for buffer 2

3 - Color for buffer 3

4:7 - Empty

8 - Buffer 0 Color/Fog (primary)

9 - Buffer 1 Color/Fog

10 - Buffer 2 Color/Fog

11 - Buffer 3 Color/Fog

12:15 - Empty

16:31 - Empty (Reserved?)

32 - Export Address

33:40 - 8 exports for multipass pixel shaders.

41:47 - Empty

48:55 - 8 debug exports (interpret as normal pixel export)

60 - export addressing mode

61:62 - Empty

- Z for primary buffer (Z exported to 'alpha' component)

### 19. Special Interpolation modes

#### 19.1 Real time commands

We are unable to use the parameter memory since there is no way for a command stream to write into it. Instead we need to add three 16x128 memories (one for each of three vertices x 16 interpolants). These will be mapped onto the register bus and written by type 0 packets, and output to the parameter busses (the sequencer and/or PA need to be able to address the realtime parameter memory as well as the regular parameter store). For higher performance we should be able to view them as two banks of 16 and do double buffering, allowing one to be loaded while the other is rasterized with. Most overlay shaders will need 2 or 4 scalar coordinates, so one option might be to restrict the memory to 16x64 or 32x64, allowing only two interpolated scalars per cycle. The only problem I see with this is that, if we view support for 16 vector-4 interpolants as important (true only if we map Microsoft's high priority stream to the realtime stream), then the PA/sequencer need to support a realtime-specific mode where we need to address 32 vectors of parameters instead of 16. This mode is triggered by the primitive type: REAL TIME. The actual memories are in the SX blocks. The parameter data memories are hooked on the RBBM bus and are loaded by the CP using register mapped memory.

## 19.2 Sprites/ XY screen coordinates/ FB information

When working with sprites, one may want to overwrite the parameter 0 with SC generated data. Also, XY screen coordinates may be needed in the shader program. This functionality is controlled by the param\_gen\_I0 register (in SQ) in conjunction with the SND\_XY register (in SC) and the param\_gen\_pos register. Also it is possible to send the faceness information (for OGL front/back special operations) to the shader using the same control register. Here is a list of all the modes and how they interact together:

The data is going to be written in the register specified by the param\_gen\_pos register.

Gen\_st is a bit taken from the interface between the SC and the SQ. This is the MSB of the primitive type. If the bit is set, it means we are dealing with Point AA, Line AA or sprite and in this case the vertex values are going to be generated between 0 and 1.

Param\_Gen\_I0 disable, snd\_xy disable, no gen\_st - I0 = No modification

Param\_Gen\_I0 disable, snd\_xy disable, gen\_st - I0 = No modification

Param\_Gen\_I0 disable, snd\_xy enable, no gen\_st - I0 = No modification


Param\_Gen\_I0 disable, snd\_xy enable, gen\_st - I0 = No modification

Param\_Gen\_I0 enable, snd\_xy disable, no gen\_st - I0 = Sign(faceness)garbage, Sign(Point)garbage, Sign(Line)s, t

Param\_Gen\_I0 enable, snd\_xy disable, gen\_st - I0 = garbage, garbage, s, t

Param\_Gen\_I0 enable, snd\_xy enable, no gen\_st - I0 = Sign(faceness)screenX, Sign(Point)screenY, Sign(Line)s, t

In other words,

The generated vector is (X in RED, Y in GREEN, S in BLUE and T in ALPHA):

X,Y,S,T

These values are always supposed to be positive and any shader use of them should use the ABS function (as their sign bits will now be used for flags).

SignX = BackFacing

SignY = Point Primitive

SignS = Line Primitive

SignT = currently unused as a flag.

If !Point & !Line, then it is a Poly.

I would assume that one implementation which allows for generic texture lookup (using 3D maps) for poly stipple and AA for the driver would be

if(Y<0) {

R = 0.0 (Point)

} else if (S < 0) {

R = 1.0 (Line)

} else {

R = 2.0 (Poly)

}

Param\_Gen\_I0 enable, snd\_xy enable, gen\_st - I0 = screen x, screen y, s, t
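The sign-bit flag convention described above (SignX, SignY, SignS) can be decoded as in the following sketch. The function name and the returned dictionary layout are hypothetical. `math.copysign` is used rather than a `< 0` comparison so that a flag carried on a value of zero (negative zero) is still detected, which is an assumption about how the flags behave on zero values.

```python
import math


def decode_generated_vector(x, y, s, t):
    """Split the generated (X, Y, S, T) vector into flags and magnitudes."""

    def sign_bit(value):
        # True when the IEEE sign bit is set, including for -0.0.
        return math.copysign(1.0, value) < 0

    if sign_bit(y):
        primitive = "point"
    elif sign_bit(s):
        primitive = "line"
    else:
        primitive = "poly"
    return {
        "back_facing": sign_bit(x),
        "primitive": primitive,
        # Shaders are expected to take ABS of the values before using them.
        "x": abs(x), "y": abs(y), "s": abs(s), "t": abs(t),
    }


back_poly = decode_generated_vector(-10.0, 20.0, 0.25, 0.5)
front_point = decode_generated_vector(3.0, -0.0, 0.5, 0.5)
front_line = decode_generated_vector(3.0, 4.0, -0.5, 0.5)
```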

#### 19.3 Auto generated counters

In the cases we are dealing with multipass shaders, the sequencer is going to generate a vector count to be able to both use this count to write the 1st pass data to memory and then use the count to retrieve the data on the 2nd pass. The count is always generated in the same way but it is passed to the shader in a slightly different way depending on the shader type (pixel or vertex). This is toggled on and off using the GEN\_INDEX register. The sequencer is going to keep two counters, one for pixels and one for vertices. Every time a full vector of vertices or pixels is written to the GPRs the counter is incremented. Every time a state change is detected, the corresponding counter is reset. While there is only one count broadcast to the GPRs, the LSB are hardwired to specific values making the index different for all elements in the vector.

#### 19.3.1 Vertex shaders

In the case of vertex shaders, if GEN\_INDEX is set, the data will be put into the x field of the third register (it means that the compiler must allocate 3 GPRs in all multipass vertex shader modes).

#### 19.3.2 Pixel shaders

In the case of pixel shaders, if GEN\_INDEX is set and Param\_Gen\_I0 is enabled, the data will be put in the x field of the param\_gen\_pos+1 register (R1.x); else if GEN\_INDEX is set, the data will be put into the x field of the 1<sup>st</sup> register (R0.x).



Figure 12: GPR input mux Control

#### 20. State management

Every clock, the sequencer will report to the CP the oldest states still in the pipe. These are the states of the programs as they enter the last ALU clause.

#### 20.1 Parameter cache synchronization

In order for the sequencer not to begin a group of pixels before the associated group of vertices has finished, the sequencer will keep a 6 bit count per state (for a total of 8 counters). These counters are initialized to 0 and every time a vertex shader exports its data TO THE PARAMETER CACHE, the corresponding counter is incremented. When the SC sends a new vector of pixels with the SC\_SQ\_new\_vector bit asserted, the sequencer will first check if the count is greater than 0 before accepting the transmission (it will in fact accept the transmission but then lower its ready to receive). Then the sequencer waits for the count to go to one and decrements it. The sequencer can then issue the group of pixels to the interpolators. Every time the state changes, the new state counter is initialized to 0.
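A rough model of these per-state counters is sketched below. The class and method names are hypothetical, and raising an error on overflow of the 6 bit count is an assumption; the text does not say what the hardware does in that case.

```python
class ParamCacheSync:
    """Per-state counters gating pixel vectors on vertex exports (a sketch)."""

    NUM_STATES = 8
    MAX_COUNT = 63  # largest value a 6 bit count can hold

    def __init__(self):
        self.counts = [0] * self.NUM_STATES

    def vertex_export_to_param_cache(self, state):
        """A vertex shader of this state exported to the parameter cache."""
        if self.counts[state] == self.MAX_COUNT:
            raise OverflowError("6 bit counter would overflow (assumed fatal)")
        self.counts[state] += 1

    def try_issue_pixel_vector(self, state):
        """Called for a pixel vector with SC_SQ_new_vector asserted.

        Returns True and consumes one count if the pixels may be issued to
        the interpolators, or False if the sequencer has to keep waiting.
        """
        if self.counts[state] == 0:
            return False
        self.counts[state] -= 1
        return True

    def state_change(self, new_state):
        """The counter of a newly started state is initialized to 0."""
        self.counts[new_state] = 0


sync = ParamCacheSync()
blocked = sync.try_issue_pixel_vector(state=2)
sync.vertex_export_to_param_cache(state=2)
issued = sync.try_issue_pixel_vector(state=2)
blocked_again = sync.try_issue_pixel_vector(state=2)
```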

## 21. XY Address imports

The SC will be able to send the XY addresses to the GPRs. It does so by interleaving the writes of the IJs (to the IJ buffer) with XY writes (to the XY buffer). Then when writing the data to the GPRs, the sequencer is going to interpolate the IJ data or pass the XY data thru a Fix→float converter and expander and write the converted values to the GPRs. The XYs are currently SCREEN SPACE COORDINATES. The values in the XY buffers will wrap. See section 19.2 for details on how to control the interpolation in this mode.

### 21.1 Vertex indexes imports

In order to import vertex indexes, we have 16 8x96 staging registers. These are loaded one line at a time by the VGT block (96 bits). They are loaded in floating point format and can be transferred in 4 or 8 clocks to the GPRs.

## 22. Registers

Please see the auto-generated web pages for register definitions.


#### 23. Interfaces

#### 23.1 External Interfaces

Whenever an x is used, it means that the bus is broadcast to all units of the same name. For example, if a bus is named SQ→SPx it means that SQ is going to broadcast the same information to all SP instances.

#### 23.2 SC to SP Interfaces

#### 23.2.1 SC\_SP#

There is one of these interfaces at the front of each of the SPs (buffer to stage pixel interpolators). This interface transmits the I,J data for pixel interpolation. For the entire system, two quads per clock are transferred to the 4 SPs, so each of these 4 interfaces transmits one half of a quad per clock. The interface below describes a half of a quad worth of data.

The actual data which is transferred per quad is

Ref Pix I => S4.20 Floating Point I value \*4
Ref Pix J => S4.20 Floating Point J value \*4

This equates to a total of 200 bits, which are transferred over 2 clocks and therefore need an interface 100 bits wide.

Additionally, X,Y data (12-bit unsigned fixed) is conditionally sent across this data bus over the same wires in an additional clock. The X,Y data is sent on the lower 24 bits of the data bus with faceness in the msb. Transfers across these interfaces are synchronized with the SC SQ IJ Control Bus transfers.

The data transfer across each of these busses is controlled by an IJ\_BUF\_INUSE\_COUNT in the SC. Each time the SC has sent a pixel vector's worth of data to the SPs, it will increment the IJ\_BUF\_INUSE\_COUNT count. Prior to sending the next pixel vector's data, it will check to make sure the count is less than MAX\_BUFER\_MINUS\_2; if not, the SC will stall until the SQ returns a pipelined pulse to decrement the count when it has scheduled a buffer free. Note: We could/may optimize for the case of sending only IJ to use all the buffers to pre-load more. Currently it is planned for the SP to hold 2 double buffers of I,J data and two buffers of X,Y data, so if either X,Y or Centers and Centroids are on, then the SC can send two buffers.
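The flow control described above behaves like a simple credit counter, sketched below. The class and method names are hypothetical, and the `limit` argument stands in for the MAX\_BUFER\_MINUS\_2 value, whose actual setting is not given here.

```python
class IJBufferFlowControl:
    """Model of the SC-side IJ_BUF_INUSE_COUNT handshake (a sketch)."""

    def __init__(self, limit):
        self.limit = limit    # stands in for MAX_BUFER_MINUS_2
        self.inuse_count = 0  # IJ_BUF_INUSE_COUNT

    def try_send_pixel_vector(self):
        """Send one pixel vector's worth of data if allowed, else stall."""
        if self.inuse_count >= self.limit:
            return False      # SC stalls until the SQ frees a buffer
        self.inuse_count += 1
        return True

    def buffer_free_pulse(self):
        """Pipelined pulse from the SQ: one buffer has been scheduled free."""
        if self.inuse_count == 0:
            raise RuntimeError("free pulse received with no buffer in use")
        self.inuse_count -= 1


flow = IJBufferFlowControl(limit=2)
sent = [flow.try_send_pixel_vector() for _ in range(3)]
flow.buffer_free_pulse()
sent_after_free = flow.try_send_pixel_vector()
```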

In at least the initial version, the SC shall send 16 quads per pixel vector even if the vector is not full. This will increment buffer write address pointers correctly all the time. (We may revisit this for both the SX,SP,SQ and add a EndOfVector signal on all interfaces to quit early. We opted for the simple mode first with a belief that only the end of packet and multiple new vector signals should cause a partial vector and that this would not really be significant performance hit.)

| Name                  | Bits | Description                                                                                       |
|-----------------------|------|---------------------------------------------------------------------------------------------------|
| SC_SP#_data           | 100  | IJ information sent over 2 clocks (or X,Y in 24 LSBs with faceness in upper bit)                  |
|                       |      | Type 0 or 1, First clock I, second clk J                                                          |
|                       |      | Field ULC URC LLC LRC                                                                             |
|                       |      | Bits [63:39] [38:26] [25:13] [12:0]                                                               |
|                       |      | Format SE4M20 SE4M20 SE4M20                                                                       |
|                       |      | Type 2                                                                                            |
|                       |      | Field Face X Y                                                                                    |
|                       |      | Bits [63] [23:12] [11:0]                                                                          |
|                       |      | Format Bit Unsigned Unsigned                                                                      |
| SC_SP#_valid          | 1    | Valid                                                                                             |
| SC_SP#_last_quad_data | 1    | This bit will be set on the last transfer of data per quad.                                       |
| SC_SP#_type           | 2    | 0 -> Indicates centroids                                                                          |
|                       |      | 1 -> Indicates centers                                                                            |
|                       |      | 2 -> Indicates X,Y Data and faceness on data bus                                                  |
|                       |      | The SC shall look at state data to determine how many types to send for the interpolation process |

The # is included for clarity in the spec and will be replaced with a prefix of u#\_ in the verilog module statement for the SC and the SP block will have neither because the instantiation will insert the prefix.

#### 23.2.2 SC\_SQ

This is the control information sent to the sequencer in order to synchronize and control the interpolation and/or loading data into the GPRs needed to execute a shader program on the sent pixels. This data will be sent over two clocks per transfer with 1 to 16 transfers. Therefore the bus (approx 94 bits) could be folded in half to approx 49 bits.

| Name        | Bits | Description |
|-------------|------|-------------|
| SC_SQ_data  | 46   | Control data sent to the SQ. |
|             |      | 1 clk transfers: |
|             |      | Event: valid data consists of event_id and state_id. Instructs the SQ to post an event vector that sends state_id and event_id through the request fifo and onto the reservation stations, making sure state_id and/or event_id gets back to the CP. Events only follow end of packets, so no pixel vectors will be in progress. |
|             |      | Empty Quad Mask: transfer control data consisting of pc_dealloc or new_vector. Receipt of this is to transfer pc_dealloc or new_vector without any valid quad data. New vector will always be posted to the request fifo. pc_dealloc will be attached to any pixel vector outstanding, or posted in the request fifo if no valid quad is outstanding. |
|             |      | 2 clk transfers: |
|             |      | Quad Data Valid: sending quad data with or without new_vector or pc_dealloc. New vector will be posted to the request fifo with or without a pixel vector. pc_dealloc will be posted with a pixel vector unless none is in progress; in that case the pc_dealloc will be posted in the request queue. Filler quads will be transferred with the quad mask set but the corresponding pixel mask set to zero. |
| SC_SQ_valid | 1    | SC sending valid data; 2<sup>nd</sup> clk could be all zeroes |

SC\_SQ\_data - first clock and second clock transfers are shown in the table below.

| Name                           | BitField | Bits | Description |
|--------------------------------|----------|------|-------------|
| 1<sup>st</sup> Clock Transfer  |          |      |             |
| SC_SQ_event                    | 0        | 1    | This transfer is a 1 clock event vector. Force quad_mask = new_vector = pc_dealloc = 0 |
| SC_SQ_event_id                 | [4:1]    | 4    | This field identifies the event. 0 => denotes an End Of State Event; 1 => TBD |
| SC_SQ_pc_dealloc               | [7:5]    | 3    | Deallocation token for the Parameter Cache |
| SC_SQ_new_vector               | 8        | 1    | The SQ must wait for Vertex shader done count > 0 and after dispatching the Pixel Vector the SQ will decrement the count. |
| SC_SQ_quad_mask                | [12:9]   | 4    | Quad Write mask left to right SP0 => SP3 |
| SC_SQ_end_of_prim              | 13       | 1    | End Of the primitive |
| SC_SQ_state_id                 | [16:14]  | 3    | State/constant pointer (6*3+3) |
| SC_SQ_pix_mask                 | [32:17]  | 16   | Valid bits for all pixels SP0=>SP3 (UL,UR,LL,LR) |
| SC_SQ_provok_vtx               | [37:36]  | 2    | Provoking vertex for flat shading |
| SC_SQ_pc_ptr0                  | [48:38]  | 11   | Parameter Cache pointer for vertex 0 |
| 2<sup>nd</sup> Clock Transfer  |          |      |             |
| SC_SQ_pc_ptr1                  | [10:0]   | 11   | Parameter Cache pointer for vertex 1 |
| SC_SQ_pc_ptr2                  | [21:11]  | 11   | Parameter Cache pointer for vertex 2 |
| SC_SQ_lod_correct              | [45:22]  | 24   | LOD correction per quad (6 bits per quad) |
| SC_SQ_prim_type                | [48:46]  | 3    | Stippled line and Real time command need to load tex coords from alternate buffer. 000: Sprite (point); 001: Line; 010: Tri_rect; 100: Realtime Sprite (point); 101: Realtime Line; 110: Realtime Tri_rect |
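
As an illustration of the first-clock layout in the table above, the following is a minimal Python sketch that packs and unpacks the listed bit fields. The dictionary keys, function names, and range check are hypothetical and only mirror the bit positions given in the table; bits [35:33], which the table does not describe, are left at zero.

```python
# Bit positions copied from the 1st-clock rows of the SC_SQ_data table.
# Each entry is (low_bit, width). Names are shortened, illustrative labels.
FIRST_CLK_FIELDS = {
    "event":       (0, 1),
    "event_id":    (1, 4),
    "pc_dealloc":  (5, 3),
    "new_vector":  (8, 1),
    "quad_mask":   (9, 4),
    "end_of_prim": (13, 1),
    "state_id":    (14, 3),
    "pix_mask":    (17, 16),
    "provok_vtx":  (36, 2),
    "pc_ptr0":     (38, 11),
}


def pack_first_clk(**fields):
    """Pack named field values into one first-clock word (an int)."""
    word = 0
    for name, value in fields.items():
        low, width = FIRST_CLK_FIELDS[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word |= value << low
    return word


def unpack_first_clk(word):
    """Split a first-clock word back into a dict of field values."""
    return {
        name: (word >> low) & ((1 << width) - 1)
        for name, (low, width) in FIRST_CLK_FIELDS.items()
    }


if __name__ == "__main__":
    w = pack_first_clk(new_vector=1, quad_mask=0b1010, state_id=5, pc_ptr0=0x7FF)
    print(hex(w), unpack_first_clk(w))
```

Since the highest field ends at bit 48, any packed word fits in 49 bits, which matches the per-clock width implied by the bit-field column.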

| Name               | Bits | Description                                                                   |
|--------------------|------|-------------------------------------------------------------------------------|
| SQ_SC_free_buff    | 1    | Pipelined bit that instructs SC to decrement count of buffers in use.         |
| SQ_SC_dec_cntr_cnt | 1    | Pipelined bit that instructs SC to decrement count of new vector and/or event sent to prevent SC from overflowing SQ interpolator/Reservation request fifo. |
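
The SQ_SC_dec_cntr_cnt description above reads like a credit-style flow control scheme. A minimal Python sketch of the SC-side bookkeeping is shown below, assuming the SC increments a counter for each new vector or event it sends and stalls when the counter reaches the fifo depth. The class name, method names, and the `fifo_depth` parameter are hypothetical; the spec text here does not state the depth.

```python
class ScVectorEventCounter:
    """SC-side count of new vectors/events sent but not yet acknowledged.

    fifo_depth is an assumed parameter: the number of entries the SQ
    request fifo can hold. It is not specified in this section.
    """

    def __init__(self, fifo_depth):
        self.fifo_depth = fifo_depth
        self.in_flight = 0

    def can_send(self):
        # The SC may send only while the SQ fifo cannot overflow.
        return self.in_flight < self.fifo_depth

    def on_send(self):
        # Called when the SC sends a new_vector or an event.
        if not self.can_send():
            raise RuntimeError("sending now would overflow the SQ request fifo")
        self.in_flight += 1

    def on_dec_cntr_cnt(self):
        # Called when SQ_SC_dec_cntr_cnt is asserted.
        if self.in_flight == 0:
            raise RuntimeError("decrement received with nothing outstanding")
        self.in_flight -= 1
```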

The scan converter will submit a partial vector whenever:

1. It receives a primitive marked with an end of packet signal.
2. A current pixel vector is being assembled with one or more valid quads and the vector has been marked for deallocate when a primitive marked new\_vector arrives. The Scan Converter will submit a partial vector (up to 16 quads, with zero pixel mask to fill out the vector) prior to submitting the new\_vector marker/primitive.

(This prevents a hang that can be demonstrated when all primitives in a packet of three vectors are culled except for a one-quad primitive that gets marked pc\_dealloc (vertices of maximum size). In this case two new\_vectors are submitted and processed, but then the one valid quad with the pc\_dealloc creates a vector, and the new\_vector would wait for another vertex vector to be processed. The vertex vector being waited for could never export until the pc\_dealloc signal made it through, hence the hang.)
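
The two submission conditions above can be written as a single predicate. The following Python sketch is only a restatement of that rule; the function and argument names are hypothetical and do not correspond to signals defined in this spec.

```python
def must_submit_partial_vector(valid_quads_in_vector,
                               vector_marked_dealloc,
                               prim_end_of_packet,
                               prim_new_vector):
    """Return True when the scan converter should flush a partial vector.

    valid_quads_in_vector: valid quads in the vector being assembled
    vector_marked_dealloc: that vector already carries a pc_dealloc
    prim_end_of_packet:    the arriving primitive is marked end of packet
    prim_new_vector:       the arriving primitive is marked new_vector
    """
    # Condition 1: a primitive marked with end of packet.
    if prim_end_of_packet:
        return True
    # Condition 2: a dealloc-marked vector with at least one valid quad
    # is pending when a new_vector primitive arrives.
    return (valid_quads_in_vector >= 1
            and vector_marked_dealloc
            and prim_new_vector)
```

In the hang scenario described above, the pending vector holds one valid quad and a pc\_dealloc when the next new\_vector arrives, so the predicate is true and the partial vector is flushed first.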

#### 23.2.3 SQ to SX(SP): Interpolator bus

| Name                       | Direction | Bits | Description                                  |
|----------------------------|-----------|------|----------------------------------------------|
| SQ_SPx_interp_flat_vtx     | SQ→SPx    | 2    | Provoking vertex for flat shading            |
| SQ_SPx_interp_flat_gouraud | SQ→SPx    | 1    | Flat or gouraud shading                      |
| SQ_SPx_interp_cyl_wrap     | SQ→SPx    | 4    | Which channels need to be cylindrically wrapped |
| SQ_SXx_pc_ptr0             | SQ→SXx    | 11   | Parameter Cache Pointer                      |
| SQ_SXx_pc_ptr1             | SQ→SXx    | 11   | Parameter Cache Pointer                      |
| SQ_SXx_pc_ptr2             | SQ→SXx    | 11   | Parameter Cache Pointer                      |
| SQ_SXx_rt_sel              | SQ→SXx    | 1    | Selects between RT and Normal data           |
| SQ_SXx_pc_wr_en            | SQ→SXx    | 1    | Write enable for the PC memories             |
| SQ_SXx_pc_wr_addr          | SQ→SXx    | 7    | Write address for the PCs                    |
| SQ_SXx_pc_channel_mask     | SQ→SXx    | 4    | Channel mask                                 |
| SQ_SXx_pc_ptr_valid        | SQ→SXx    | 1    | Read pointers are valid.                     |
| SQ_SPx_interp_valid        | SQ→SPx    | 1    | Interpolation control valid                  |

#### 23.2.4 SQ to SP: Staging Register Data

This is a broadcast bus that sends the VSISR information to the staging registers of the shader pipes.

| Name              | Direction | Bits | Description                                             |
|-------------------|-----------|------|---------------------------------------------------------|
| SQ_SPx_vsr_data   | SQ→SPx    | 96   | Pointers of indexes or HOS surface information          |
| SQ_SPx_vsr_double | SQ→SPx    | 1    | 0: Normal 96 bits per vert 1: double 192 bits per vert  |
| SQ_SP0_vsr_valid  | SQ→SP0    | 1    | Data is valid                                           |
| SQ_SP1_vsr_valid  | SQ→SP1    | 1    | Data is valid                                           |
| SQ_SP2_vsr_valid  | SQ→SP2    | 1    | Data is valid                                           |
| SQ_SP3_vsr_valid  | SQ→SP3    | 1    | Data is valid                                           |
| SQ_SPx_vsr_read   | SQ→SPx    | 1    | Increment the read pointers                             |

#### 23.2.5 VGT to SQ: Vertex interface

#### 23.2.5.1 Interface Signal Table

The area difference between the two methods is not sufficient to warrant complicating the interface or the state requirements of the VSISRs. Therefore, the POR for this interface is that the VGT will transmit the data to the VSISRs (via the Shader Sequencer) in full, 32-bit floating-point format. The VGT can transmit up to six 32-bit floating-point values to each VSISR where four or more values require two transmission clocks. The data bus is 96 bits wide.
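
The clock count implied by the paragraph above follows from the bus width: a 96-bit bus carries three 32-bit values per clock, so four to six values need a second clock. A small Python sketch of that arithmetic, with a hypothetical function name:

```python
BUS_BITS = 96        # width of the VSISR data bus, from the text
VALUE_BITS = 32      # each value is a 32-bit float
MAX_VALUES = 6       # the VGT sends at most six values per VSISR


def vsisr_transfer_clocks(num_values):
    """Number of transmission clocks needed for num_values floats."""
    if not 1 <= num_values <= MAX_VALUES:
        raise ValueError("num_values must be between 1 and 6")
    values_per_clock = BUS_BITS // VALUE_BITS  # 3 values per clock
    # Ceiling division without importing math.
    return -(-num_values // values_per_clock)
```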

| Name                   | Bits | Description |
|------------------------|------|-------------|
| VGT_SQ_vsisr_data      | 96   | Pointers of indexes or HOS surface information |
| VGT_SQ_event           | 1    | VGT is sending an event |
| VGT_SQ_vsisr_continued | 1    | 0: Normal 96 bits per vert 1: double 192 bits per vert |
| VGT_SQ_end_of_vtx_vect | 1    | Indicates the last VSISR data set for the current process vector (for double vector data, "end_of_vector" is set on the first vector) |
| VGT_SQ_indx_valid      | 1    | Vsisr data is valid |
| VGT_SQ_state           | 3    | Render State (6*3+3 for constants). This signal is guaranteed to be correct when "VGT_SQ_vgt_end_of_vector" is high. |
| VGT_SQ_send            | 1    | Data on the VGT_SQ is valid (see write-up for standard R400 SEND/RTR interface handshaking) |
| SQ_VGT_rtr             | 1    | Ready to receive (see write-up for standard R400 SEND/RTR interface handshaking) |

#### 23.2.5.2 Interface Diagrams





Figure 1. Detailed Logical Diagram for PA\_SQ\_vgt Interface.

#### 23.2.6 SQ to SX: Control bus

| Name               | Direction | Bits | Description |
|--------------------|-----------|------|-------------|
| SQ_SXx_exp_type    | SQ→SXx    | 2    | 00: Pixel without z (1 to 4 buffers) 01: Pixel with z (1 to 4 buffers) 10: Position (1 or 2 results) 11: Pass thru (4, 8 or 12 results aligned) |
| SQ_SXx_exp_number  | SQ→SXx    | 2    | Number of locations needed in the export buffer (encoding depends on the type, see below). |
| SQ_SXx_exp_alu_id  | SQ→SXx    | 1    | ALU ID |
| SQ_SXx_exp_valid   | SQ→SXx    | 1    | Valid bit |
| SQ_SXx_exp_state   | SQ→SXx    | 3    | State Context |
| SQ_SXx_free_done   | SQ→SXx    | 1    | Pulse that indicates that the previous export is finished from the point of view of the SP (this can be sent with or without the other fields of the interface). This does not necessarily mean that the data has been transferred to RB or PA, or that the space in the export buffer for that particular vector thread has been freed up. |
| SQ_SXx_free_alu_id | SQ→SXx    | 1    | ALU ID |

The number of export locations depends on the type:

- Type 00: Pixels without Z
  - 00 = 1 buffer
  - 01 = 2 buffers
  - 10 = 3 buffers
  - 11 = 4 buffers
- Type 01: Pixels with Z
  - 00 = 2 buffers (color + Z)
  - 01 = 3 buffers (2 color + Z)
  - 10 = 4 buffers (3 color + Z)
  - 11 = 5 buffers (4 color + Z)
- Type 10: Position export
  - 00 = 1 position
  - 01 = 2 positions
  - 1X = Undefined
- Type 11: Pass Thru
  - 00 = 4 buffers
  - 01 = 8 buffers
  - 10 = 12 buffers
  - 11 = Undefined
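
The encoding listed above can be captured in a small decode function. This Python sketch is illustrative only; the function name is hypothetical, and undefined encodings are reported as `None` rather than assigned any behavior.

```python
def export_locations(exp_type, exp_number):
    """Decode (SQ_SXx_exp_type, SQ_SXx_exp_number) into a location count.

    Both arguments are 2-bit values. Returns None for the encodings the
    list marks as Undefined.
    """
    if not (0 <= exp_type <= 3 and 0 <= exp_number <= 3):
        raise ValueError("exp_type and exp_number are 2-bit fields")
    if exp_type == 0b00:            # pixels without Z: 1 to 4 buffers
        return exp_number + 1
    if exp_type == 0b01:            # pixels with Z: 2 to 5 buffers
        return exp_number + 2
    if exp_type == 0b10:            # position export: 1 or 2 positions
        return exp_number + 1 if exp_number < 2 else None
    # exp_type == 0b11, pass thru: 4, 8 or 12 buffers
    return 4 * (exp_number + 1) if exp_number < 3 else None
```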

The fields below the thick black line (SQ\_SXx\_free\_done and SQ\_SXx\_free\_alu\_id) form the end of transfer packet that tells the SX that a given export is finished. The report packet will always arrive either before or at the same time as the next export to the same ALU id.

#### 23.2.7 SX to SQ: Output file control

| Name                 | Direction | Bits | Description                                                                                                                                     |
|----------------------|-----------|------|-------------------------------------------------------------------------------------------------------------------------------------------------|
| SXx_SQ_exp_count_rdy | SXx→SQ    | 1    | Raised by SX0 to indicate that the following two fields reflect the result of the most recent export                                            |
| SXx_SQ_exp_pos_avail | SXx→SQ    | 1    | Specifies whether there is room for another position.                                                                                           |
| SXx_SQ_exp_buf_avail | SXx→SQ    | 7    | Specifies the space available in the output buffers. 0: buffers are full; 1: 2K-bits available (32-bits for each of the 64 pixels in a clause); 64: 128K-bits available (16 128-bit entries for each of 64 pixels); 127: RESERVED |
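
The sized encodings given for SXx\_SQ\_exp\_buf\_avail are consistent with a count of 2K-bit units: one unit is 32 bits for each of 64 pixels, and 128K bits is 64 such units. The Python sketch below assumes the field scales linearly between those points, which the table does not state explicitly. The function name is hypothetical, and codes the table does not describe are rejected.

```python
UNIT_BITS = 32 * 64        # 2K bits: 32 bits for each of 64 pixels
MAX_UNITS = 64             # 64 units = 128K bits
RESERVED_CODE = 127        # marked RESERVED in the table


def exp_buf_avail_bits(code):
    """Convert an exp_buf_avail code into available bits.

    Assumes linear scaling in 2K-bit units from 0 (full) to 64 (128K bits).
    """
    if code == RESERVED_CODE:
        raise ValueError("code 127 is reserved")
    if not 0 <= code <= MAX_UNITS:
        raise ValueError("code is not described by the table")
    return code * UNIT_BITS
```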

#### 23.2.8 SQ to TP: Control bus

Once every clock, the fetch unit tells the sequencer which RS line it is currently working on and whether the data in the GPRs is ready. This way the sequencer can update the fetch valid bit flags for the reservation station. The sequencer also provides the instruction and constants for the fetch to execute, and the address in the register file where the fetch return data is to be written.

| Name                   | Direction | Bits | Description                                               |
|------------------------|-----------|------|-----------------------------------------------------------|
| TPx_SQ_data_rdy        | TPx→SQ    | 1    | Data ready                                                |
| TPx_SQ_rs_line_num     | TPx→SQ    | 6    | Line number in the Reservation station                    |
| TPx_SQ_type            | TPx→SQ    | 1    | Type of data sent (0:PIXEL, 1:VERTEX)                     |
| SQ_TPx_send            | SQ→TPx    | 1    | Sending valid data                                        |
| SQ_TPx_const           | SQ→TPx    | 48   | Fetch state sent over 4 clocks (192 bits total)           |
| SQ_TPx_instr           | SQ→TPx    | 24   | Fetch instruction sent over 4 clocks                      |
| SQ_TPx_end_of_group    | SQ→TPx    | 1    | Last instruction of the group                             |
| SQ_TPx_Type            | SQ→TPx    | 1    | Type of data sent (0:PIXEL, 1:VERTEX)                     |
| SQ_TPx_gpr_phase       | SQ→TPx    | 2    | Write phase signal                                        |
| SQ_TP0_lod_correct     | SQ→TP0    | 6    | LOD correct 3 bits per comp 2 components per quad         |
| SQ_TP0_pix_mask        | SQ→TP0    | 4    | Pixel mask 1 bit per pixel                                |
| SQ_TP1_lod_correct     | SQ→TP1    | 6    | LOD correct 3 bits per comp 2 components per quad         |
| SQ_TP1_pix_mask        | SQ→TP1    | 4    | Pixel mask 1 bit per pixel                                |
| SQ_TP2_lod_correct     | SQ→TP2    | 6    | LOD correct 3 bits per comp 2 components per quad         |
| SQ_TP2_pix_mask        | SQ→TP2    | 4    | Pixel mask 1 bit per pixel                                |
| SQ_TP3_lod_correct     | SQ→TP3    | 6    | LOD correct 3 bits per comp 2 components per quad         |
| SQ_TP3_pix_mask        | SQ→TP3    | 4    | Pixel mask 1 bit per pixel                                |
| SQ_TPx_rs_line_num     | SQ→TPx    | 6    | Line number in the Reservation station                    |
| SQ_TPx_write_gpr_index | SQ→TPx    | 7    | Index into Register file for write of returned Fetch Data |
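
The table above sends the 192-bit fetch state as four 48-bit transfers on SQ\_TPx\_const, and the fetch instruction as four 24-bit transfers on SQ\_TPx\_instr. A minimal Python sketch of that serialization follows. The function names are hypothetical, and sending the least significant bits first is an assumption; the table does not give the transfer order.

```python
def split_beats(value, beat_bits, num_beats=4):
    """Split an integer into num_beats chunks of beat_bits each.

    Assumed order: least significant chunk first.
    """
    if value < 0 or value >> (beat_bits * num_beats):
        raise ValueError("value does not fit in the total transfer width")
    mask = (1 << beat_bits) - 1
    return [(value >> (i * beat_bits)) & mask for i in range(num_beats)]


def join_beats(beats, beat_bits):
    """Reassemble chunks produced by split_beats (same assumed order)."""
    value = 0
    for i, beat in enumerate(beats):
        value |= beat << (i * beat_bits)
    return value


if __name__ == "__main__":
    fetch_state = (1 << 191) | 0xABCDEF          # an arbitrary 192-bit value
    beats = split_beats(fetch_state, beat_bits=48)
    print([hex(b) for b in beats])
    print(join_beats(beats, 48) == fetch_state)
```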

#### 23.2.9 TP to SQ: Texture stall

The TP sends this signal to the SQ and the SPs when its input buffer is full.



| Name              | Direction | Bits | Description                                  |
|-------------------|-----------|------|----------------------------------------------|
| TP_SQ_fetch_stall | TP→SQ     | 1    | Do not send more texture request if asserted |

#### 23.2.10 SQ to SP: Texture stall

| Name               | Direction | Bits | Description                                  |
|--------------------|-----------|------|----------------------------------------------|
| SQ_SPx_fetch_stall | SQ→SPx    | 1    | Do not send more texture request if asserted |

#### 23.2.11 SQ to SP: GPR and auto counter

| Name                 | Direction | Bits | Description                                                                                                                      |
|----------------------|-----------|------|----------------------------------------------------------------------------------------------------------------------------------|
| SQ_SPx_gpr_wr_addr   | SQ→SPx    | 7    | Write address                                                                                                                    |
| SQ_SPx_gpr_rd_addr   | SQ→SPx    | 7    | Read address                                                                                                                     |
| SQ_SPx_gpr_rd_en     | SQ→SPx    | 1    | Read Enable                                                                                                                      |
| SQ_SP0_gpr_wr_en     | SQ→SP0    | 1    | Write Enable for the GPRs of SP0                                                                                                 |
| SQ_SP1_gpr_wr_en     | SQ→SP1    | 1    | Write Enable for the GPRs of SP1                                                                                                 |
| SQ_SP2_gpr_wr_en     | SQ→SP2    | 1    | Write Enable for the GPRs of SP2                                                                                                 |
| SQ_SP3_gpr_wr_en     | SQ→SP3    | 1    | Write Enable for the GPRs of SP3                                                                                                 |
| SQ_SPx_gpr_phase     | SQ→SPx    | 2    | The phase mux (arbitrates between inputs, ALU SRC reads and writes)                                                              |
| SQ_SPx_channel_mask  | SQ→SPx    | 4    | The channel mask                                                                                                                 |
| SQ_SPx_gpr_input_sel | SQ→SPx    | 2    | When the phase mux selects the inputs, this selects the source to read from: interpolated data, VTX0, VTX1, or autogen counter.  |
| SQ_SPx_auto_count    | SQ→SPx    | 12?  | Auto count generated by the SQ, common for all shader pipes                                                                      |


#### 23.2.12 SQ to SPx: Instructions

| Name                  | Direction | Bits | Description                                                                                                                                                     |
|-----------------------|-----------|------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------|
| SQ_SPx_instr_start    | SQ→SPx    | 1    | Instruction start                                                                                                                                               |
| SQ_SP_instr           | SQ→SPx    | 22   | Transferred over 4 cycles. 0: SRC A Select 2:0, SRC A Argument Modifier 3:3, SRC A swizzle 11:4, VectorDst 17:12, Per channel use mask (PV/Reg) 21:18          |
|                       |           |      | 1: SRC B Select 2:0, SRC B Argument Modifier 3:3, SRC B swizzle 11:4, ScalarDst 17:12, Per channel use mask (PV/Reg) 21:18                                      |
|                       |           |      | 2: SRC C Select 2:0, SRC C Argument Modifier 3:3, SRC C swizzle 11:4, Per channel use mask (PV/Reg) 21:18                                                       |
|                       |           |      | 3: Vector Opcode 4:0, Scalar Opcode 10:5, Vector Clamp 11:11, Scalar Clamp 12:12, Vector Write Mask 16:13, Scalar Write Mask 20:17                              |
| SQ_SPx_exp_alu_id     | SQ→SPx    | 1    | ALU ID                                                                                                                                                          |
| SQ_SPx_exporting      | SQ→SPx    | 2    | 0: Not Exporting, 1: Vector Exporting, 2: Scalar Exporting                                                                                                      |
| SQ_SPx_stall          | SQ→SPx    | 1    | Stall signal                                                                                                                                                    |
| SQ_SP0_write_mask     | SQ→SP0    | 4    | Result of pixel kill in the shader pipe, which must be output for all pixel exports (depth and all color buffers). 4x4 because 16 pixels are computed per clock |
| SQ_SP1_write_mask     | SQ→SP1    | 4    | Result of pixel kill in the shader pipe, which must be output for all pixel exports (depth and all color buffers). 4x4 because 16 pixels are computed per clock |
| SQ_SP2_write_mask     | SQ→SP2    | 4    | Result of pixel kill in the shader pipe, which must be output for all pixel exports (depth and all color buffers). 4x4 because 16 pixels are computed per clock |
| SQ_SP3_write_mask     | SQ→SP3    | 4    | Result of pixel kill in the shader pipe, which must be output for all pixel exports (depth and all color buffers). 4x4 because 16 pixels are computed per clock |
| SQ_SPx_last           | SQ→SPx    | 1    | Last instruction of the block                                                                                                                                   |
| SQ_SP0_pred_overwrite | SQ→SP0    | 4    | Indicates to overwrite the use of PV/PS because of the predication (use the GPRs instead). This operation is done on a per-pixel basis.                         |
| SQ_SP1_pred_overwrite | SQ→SP1    | 4    | Indicates to overwrite the use of PV/PS because of the predication (use the GPRs instead). This operation is done on a per-pixel basis.                         |
| SQ_SP2_pred_overwrite | SQ→SP2    | 4    | Indicates to overwrite the use of PV/PS because of the predication (use the GPRs instead). This operation is done on a per-pixel basis.                         |
| SQ_SP3_pred_overwrite | SQ→SP3    | 4    | Indicates to overwrite the use of PV/PS because of the predication (use the GPRs instead). This operation is done on a per-pixel basis.                         |
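
As a rough illustration of the SQ_SP_instr layout in the table above, the following Python sketch packs and unpacks the 22-bit word sent on each of the four transfer cycles. The bit ranges are copied from the table; the field names, the `CYCLE_FIELDS` dictionary, and the `encode`/`decode` helpers are hypothetical and exist only for this example.

```python
# Bit ranges (hi, lo) per transfer cycle, copied from the SQ_SP_instr row.
# Field names are illustrative, not taken from the specification.
CYCLE_FIELDS = {
    0: {"src_a_select": (2, 0), "src_a_arg_mod": (3, 3),
        "src_a_swizzle": (11, 4), "vector_dst": (17, 12),
        "use_mask": (21, 18)},
    1: {"src_b_select": (2, 0), "src_b_arg_mod": (3, 3),
        "src_b_swizzle": (11, 4), "scalar_dst": (17, 12),
        "use_mask": (21, 18)},
    2: {"src_c_select": (2, 0), "src_c_arg_mod": (3, 3),
        "src_c_swizzle": (11, 4), "use_mask": (21, 18)},
    3: {"vector_opcode": (4, 0), "scalar_opcode": (10, 5),
        "vector_clamp": (11, 11), "scalar_clamp": (12, 12),
        "vector_write_mask": (16, 13), "scalar_write_mask": (20, 17)},
}


def encode(cycle, **fields):
    """Pack named fields into the 22-bit word for one cycle."""
    word = 0
    for name, value in fields.items():
        hi, lo = CYCLE_FIELDS[cycle][name]
        width = hi - lo + 1
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word |= value << lo
    return word


def decode(cycle, word):
    """Unpack a 22-bit word into a dict of field values for one cycle."""
    return {name: (word >> lo) & ((1 << (hi - lo + 1)) - 1)
            for name, (hi, lo) in CYCLE_FIELDS[cycle].items()}


# Round trip on the opcode cycle (cycle 3); unspecified fields stay 0.
word = encode(3, vector_opcode=5, scalar_opcode=33,
              vector_clamp=1, vector_write_mask=0b1010)
fields = decode(3, word)
```

Keeping the layout in a table of (hi, lo) pairs means a change to any bit range in the specification only touches one entry.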

#### 23.2.13 SP to SQ: Constant address load / Predicate Set

| Name              | Direction | Bits | Description                                                 |
|-------------------|-----------|------|-------------------------------------------------------------|
| SP0_SQ_const_addr | SP0→SQ    | 36   | Constant address load / predicate vector load (4 bits only) to the sequencer |
| SP0_SQ_valid      | SP0→SQ    | 1    | Data valid                                                                   |
| SP1_SQ_const_addr | SP1→SQ    | 36   | Constant address load / predicate vector load (4 bits only) to the sequencer |
| SP1_SQ_valid      | SP1→SQ    | 1    | Data valid                                                                   |
| SP2_SQ_const_addr | SP2→SQ    | 36   | Constant address load / predicate vector load (4 bits only) to the sequencer |
| SP2_SQ_valid      | SP2→SQ    | 1    | Data valid                                                                   |
| SP3_SQ_const_addr | SP3→SQ    | 36   | Constant address load / predicate vector load (4 bits only) to the sequencer |
| SP3_SQ_valid      | SP3→SQ    | 1    | Data valid                                                                   |
| SP0_SQ_data_type  | SP→SQ     | 1    | Data Type. 0: Constant Load, 1: Predicate Set                                |

#### 23.2.14 SQ to SPx: constant broadcast

| Name         | Direction | Bits | Description        |
|--------------|-----------|------|--------------------|
| SQ_SPx_const | SQ→SPx    | 128  | Constant broadcast |

#### 23.2.15 SP to SQ: Kill vector load

| Name             | Direction | Bits | Description      |
|------------------|-----------|------|------------------|
| SP0_SQ_kill_vect | SP0→SQ    | 4    | Kill vector load |
| SP1_SQ_kill_vect | SP1→SQ    | 4    | Kill vector load |
| SP2_SQ_kill_vect | SP2→SQ    | 4    | Kill vector load |
| SP3_SQ_kill_vect | SP3→SQ    | 4    | Kill vector load |

#### 23.2.16 SQ to CP: RBBM bus

| Name           | Direction | Bits | Description          |
|----------------|-----------|------|----------------------|
| SQ_RBB_rs      | SQ→CP     | 1    | Read Strobe          |
| SQ_RBB_rd      | SQ→CP     | 32   | Read Data            |
| SQ_RBBM_nrtrtr | SQ→CP     | 1    | Optional             |
| SQ_RBBM_rtr    | SQ→CP     | 1    | Real-Time (Optional) |

#### 23.2.17 CP to SQ: RBBM bus

| Name               | Direction | Bits | Description                        |
|--------------------|-----------|------|------------------------------------|
| rbbm_we            | CP→SQ     | 1    | Write Enable                       |
| rbbm_a             | CP→SQ     | 15   | Address (16:2); upper extent is TBD |
| rbbm_wd            | CP→SQ     | 32   | Data                               |
| rbbm_be            | CP→SQ     | 4    | Byte Enables                       |
| rbbm_re            | CP→SQ     | 1    | Read Enable                        |
| rbb_rs0            | CP→SQ     | 1    | Read Return Strobe 0               |
| rbb_rs1            | CP→SQ     | 1    | Read Return Strobe 1               |
| rbb_rd0            | CP→SQ     | 32   | Read Data 0                        |
| rbb_rd1            | CP→SQ     | 32   | Read Data 1                        |
| RBBM_SQ_soft_reset | CP→SQ     | 1    | Soft Reset                         |




#### 23.2.18 SQ to CP: State report

| Name             | Direction | Bits | Description            |
|------------------|-----------|------|------------------------|
| SQ_CP_vs_event   | SQ→CP     | 1    | Vertex Shader Event    |
| SQ_CP_vs_eventid | SQ→CP     | 2    | Vertex Shader Event ID |
| SQ_CP_ps_event   | SQ→CP     | 1    | Pixel Shader Event     |
| SQ_CP_ps_eventid | SQ→CP     | 2    | Pixel Shader Event ID  |

eventid = 0 => \*sEndOfState (i.e. VsEndOfState)

eventid = 1 => \*sDone (i.e. VsDone)

So the CP assumes the Vs is done with a state whenever it receives a pulse on SQ\_CP\_vs\_event with SQ\_CP\_vs\_eventid = 0.
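
The event ID mapping can be sketched in a few lines of Python. This is only an illustration of how a consumer such as the CP might interpret the two signals; the `decode_event` helper and the `EVENT_SUFFIX` table are hypothetical names, and event IDs 2 and 3 are treated as undefined because this section does not assign them a meaning.

```python
# Event ID meanings stated in the text; IDs 2 and 3 are not defined here.
EVENT_SUFFIX = {0: "EndOfState", 1: "Done"}


def decode_event(shader, event, eventid):
    """Return e.g. 'VsEndOfState' for a pulse, or None when event is low.

    shader is "Vs" or "Ps"; event is the 1-bit pulse; eventid is the 2-bit ID.
    """
    if not event:
        return None
    if eventid not in EVENT_SUFFIX:
        raise ValueError(f"event ID {eventid} is not defined in this section")
    return shader + EVENT_SUFFIX[eventid]
```

For example, `decode_event("Vs", 1, 0)` returns `"VsEndOfState"`, which is the case where the CP treats the vertex shader as done with a state.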

### 23.3 Example of control flow program execution

We now provide some examples of execution to better illustrate the new design.

#### Given the program:

Alu 0
Alu 1
Tex 0
Tex 1
Alu 3 Serial
Alu 4
Tex 2
Alu 5
Alu 6 Serial
Tex 3
Alu 7
Alloc Position 1 buffer
Alu 8 Export
Tex 4
Alloc Parameter 3 buffers
Alu 9 Export 0
Alu 10 Serial Export 2
Alu 11 Export 1 End

#### Would be converted into the following CF instructions:

Execute 0 Alu 0 Alu 0 Tex 0 Tex 1 Alu 0 Alu 0 Tex 0 Alu 1 Alu 0 Tex
Execute 0 Alu
Alloc Position 1
Execute 0 Alu 0 Tex
Alloc Param 3
Execute end 0 Alu 0 Tex 1 Alu 0 Alu

#### And the execution of this program would look like this:

#### Put thread in Vertex RS:

Control Flow Instruction Pointer (12 bits), (CFP)
Execution Count Marker (3 or 4 bits), (ECM)
Loop Iterators (4x9 bits), (LI)
Call return pointers (4x12 bits), (CRP)
Predicate Bits (4x64 bits), (PB)
Export ID (1 bit), (EXID)

GPR Base Ptr (8 bits), (GPR)
Export Base Ptr (7 bits), (EB)
Context Ptr (3 bits), (CPTR)
LOD correction bits (16x6 bits), (LOD)

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 0   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

Valid Thread (VALID)

Texture/ALU engine needed (TYPE)

Texture Reads are outstanding (PENDING)

Waiting on Texture Read to Complete (SERIAL)

Allocation Wait (2 bits) (ALLOC)

00 - No allocation needed

01 - Position export allocation needed (ordered export)

10 - Parameter or pixel export needed (ordered export)

11 - pass thru (out of order export)

Allocation Size (4 bits) (SIZE)

Position Allocated (POS ALLOC)

First thread of a new context (FIRST)

Last (1 bit), (LAST)

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 0       | 0      | 0     | 0    | 0         | 1     | 0    |
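
To make the per-thread record concrete, here is a minimal Python sketch of the state and status fields listed above. The field widths and the ALLOC encodings come from the lists; the `Alloc` enum, the `STATE_FIELDS` table, and the helper names are assumptions made for this illustration and do not describe the actual hardware storage.

```python
from enum import IntEnum


class Alloc(IntEnum):
    """ALLOC status field encodings listed above."""
    NONE = 0b00            # no allocation needed
    POSITION = 0b01        # position export allocation (ordered export)
    PARAM_OR_PIXEL = 0b10  # parameter or pixel export (ordered export)
    PASS_THRU = 0b11       # pass thru (out of order export)


# State fields as (number of entries, bits per entry), from the list above.
# ECM is handled separately because its width is given as 3 or 4 bits.
STATE_FIELDS = {
    "CFP": (1, 12), "LI": (4, 9), "CRP": (4, 12), "PB": (4, 64),
    "EXID": (1, 1), "GPR": (1, 8), "EB": (1, 7), "CPTR": (1, 3),
    "LOD": (16, 6),
}


def state_bits(ecm_width):
    """Total state bits per thread for a given ECM width."""
    if ecm_width not in (3, 4):
        raise ValueError("ECM width is listed as 3 or 4 bits")
    return ecm_width + sum(n * w for n, w in STATE_FIELDS.values())


def initial_thread():
    """State and status of a thread just placed in the vertex RS."""
    state = {name: 0 for name in list(STATE_FIELDS) + ["ECM"]}
    status = {"VALID": 1, "TYPE": "ALU", "PENDING": 0, "SERIAL": 0,
              "ALLOC": Alloc.NONE, "SIZE": 0, "POS_ALLOC": 0,
              "FIRST": 1, "LAST": 0}
    return state, status
```

With the listed widths, `state_bits(3)` gives 470 and `state_bits(4)` gives 471; the status bits are not included in that count.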

Then the thread is picked up for the execution of the first control flow instruction:

Execute 0 Alu 0 Alu 0 Tex 0 Tex 1 Alu 0 Alu 0 Tex 0 Alu 1 Alu 0 Tex

It executes the first two ALU instructions and goes back to the RS for a resource request change. Here is the state returned to the RS:

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 2   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | TEX  | 0       | 0      | 0     | 0    | 0         | 1     | 0    |

Then when the texture pipe frees up, the arbiter picks up the thread to issue the texture reads. The thread comes back in this state:

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 4   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 1      | 0     | 0    | 0         | 1     | 0    |


Because the SERIAL bit is set, the arbiter must wait for the texture reads to return and clear the PENDING bit before it can pick the thread up. Let's say the texture reads are complete; the arbiter then picks up the thread and returns it in this state:

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 6   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | TEX  | 0       | 0      | 0     | 0    | 0         | 1     | 0    |

Again the TP frees up, the arbiter picks up the thread and executes. It returns in this state:

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 7   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 0      | 0     | 0    | 0         | 1     | 0    |

Now, even if the texture has not returned we can still pick up the thread for ALU execution because the serial bit is not set. The thread will however come back to the RS for the second ALU instruction because it has the serial bit set.

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 8   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 1      | 0     | 0    | 0         | 1     | 0    |
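The interaction between PENDING and SERIAL in the last two states can be summarized as a small eligibility test. This is only an illustrative sketch, not the actual arbiter logic: the `StatusBits` class and the `alu_can_pick` function are hypothetical names, and only the bit names (VALID, TYPE, PENDING, SERIAL) come from the tables above.

```python
from dataclasses import dataclass


@dataclass
class StatusBits:
    """Subset of the status bits from the tables above (hypothetical container)."""
    valid: bool
    thread_type: str  # "ALU" or "TEX", as in the TYPE column
    pending: bool     # texture reads are still outstanding
    serial: bool      # the ALU clause must wait for outstanding texture reads


def alu_can_pick(s: StatusBits) -> bool:
    """Assumed ALU-side eligibility rule derived from the walkthrough."""
    if not (s.valid and s.thread_type == "ALU"):
        return False
    # Outstanding texture reads only block the thread when SERIAL is set.
    return not (s.serial and s.pending)


# State with ECM = 7: PENDING = 1, SERIAL = 0, so the thread can be picked up.
print(alu_can_pick(StatusBits(True, "ALU", pending=True, serial=False)))
# State with ECM = 8: PENDING = 1, SERIAL = 1, so the thread must wait for the TP.
print(alu_can_pick(StatusBits(True, "ALU", pending=True, serial=True)))
```

Once the TP clears PENDING, the second state becomes eligible under the same rule.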

As soon as the TP clears the pending bit the thread is picked up and returns:

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 9   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | TEX  | 0       | 0      | 0     | 0    | 0         | 1     | 0    |

Picked up by the TP and returns:

Execute 0 Alu




| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 1          | 0   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 0      | 0     | 0    | 0         | 1     | 0    |

Picked up by the ALU and returns (let's say the TP has not returned yet):

Alloc Position 1

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 2          | 0   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 0      | 01    | 1    | 0         | 1     | 0    |

If the SX has room for the export, the SQ allocates the space and picks up the thread for execution. It returns to the RS in this state:

Execute 0 Alu 0 Tex

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 3          | 1   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | TEX  | 1       | 0      | 0     | 0    | 1         | 1     | 0    |

Now, since the TP has not returned yet, we must wait for it to return because we cannot issue multiple texture requests. The TP returns, clears the PENDING bit and we proceed:

Alloc Param 3

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 4          | 0   | 0  | 0   | 0  | 1    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 0      | 10    | 3    | 1         | 1     | 0    |

Once again the SQ makes sure the SX has enough room in the Parameter cache before it can pick up this thread.
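The two waiting conditions described in this part of the walkthrough (a TEX thread whose previous texture request is still pending, and an allocating thread that needs room in the SX) could be sketched as follows. This is a rough illustration under stated assumptions: the ALLOC encodings (01 for position, 10 for parameter) are read off the tables above, while `SxSpace`, its fields, the units of SIZE, and `can_pick` are hypothetical. SERIAL handling is deliberately left out.

```python
from dataclasses import dataclass

# ALLOC encodings as they appear in the status tables above (inferred, not normative).
ALLOC_NONE, ALLOC_POSITION, ALLOC_PARAM = 0b00, 0b01, 0b10


@dataclass
class StatusBits:
    thread_type: str  # "ALU" or "TEX"
    pending: bool     # a texture request is still outstanding
    alloc: int        # one of the ALLOC_* encodings
    size: int         # requested allocation size (units are an assumption)


@dataclass
class SxSpace:
    """Hypothetical view of the free space the SX reports to the SQ."""
    position_free: int
    param_free: int


def can_pick(s: StatusBits, sx: SxSpace) -> bool:
    """Assumed gating rule; ignores the SERIAL bit for brevity."""
    if s.thread_type == "TEX":
        # A thread cannot issue a new texture request while one is pending.
        return not s.pending
    if s.alloc == ALLOC_POSITION:
        return sx.position_free >= s.size
    if s.alloc == ALLOC_PARAM:
        return sx.param_free >= s.size
    return True


# "Alloc Param 3" state: eligible only if the parameter cache has 3 free entries.
thread = StatusBits("ALU", pending=True, alloc=ALLOC_PARAM, size=3)
print(can_pick(thread, SxSpace(position_free=0, param_free=2)))
print(can_pick(thread, SxSpace(position_free=0, param_free=3)))
```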

Execute\_end 0 Alu 0 Tex 1 Alu 0 Alu

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | TEX  | 1       | 0      | 0     | 0    | 1         | 1     | 0    |

This executes on the TP and then returns:

| State Bits |     |    |     |    |      |     |     |      |     |
|------------|-----|----|-----|----|------|-----|-----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB  | CPTR | LOD |
| 5          | 2   | 0  | 0   | 0  | 1    | 0   | 100 | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 1      | 0     | 0    | 1         | 1     | 1    |

The thread waits for the TP to return because texture reads are pending (and SERIAL is set in this case). It then executes and does not return to the RS because the LAST bit is set. This is the end of the thread; before dropping it, the SQ notifies the SX of export completion.
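The end-of-thread behavior can be sketched as a small hook that runs when a thread finishes executing. The function name, the dictionary-based thread record, and the `notify_sx` callback are all hypothetical; only the rule itself (LAST set means notify the SX and do not return to the RS) comes from the text above.

```python
from typing import Callable, Dict, List

Thread = Dict[str, object]  # hypothetical record; only "id" and "last" are used here


def retire_or_requeue(thread: Thread,
                      reservation_station: List[Thread],
                      notify_sx: Callable[[Thread], None]) -> None:
    """Assumed hook called after a thread finishes executing."""
    if thread["last"]:
        # End of the thread: report export completion, then drop the thread.
        notify_sx(thread)
        return
    # Otherwise the thread goes back to the RS for its next instruction.
    reservation_station.append(thread)


rs: List[Thread] = []
completed: List[object] = []
retire_or_requeue({"id": 0, "last": False}, rs, lambda t: completed.append(t["id"]))
retire_or_requeue({"id": 1, "last": True}, rs, lambda t: completed.append(t["id"]))
print([t["id"] for t in rs], completed)
```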

#### 24. Open issues

Need to do some testing on the size of the register file as well as on the register file allocation method (dynamic vs. static).

Saving power?

| ORIGINATE DATE     | EDIT DATE          | DOCUMENT-REV. NUM. | PAGE    |
|--------------------|--------------------|--------------------|---------|
| 24 September, 2001 | 4 September, 20159 | GEN-CXXXXX-REVA    | 1 of 53 |

Author: Laurent Lefebvre

| Issue To: | Copy No: |
|-----------|----------|
|           |          |

# **R400 Sequencer Specification**

# SQ

**AUTOMATICALLY UPDATED FIELDS:** 

Document Location: C:\perforce\r400\doc\_lib\design\blocks\sq\R400\_Sequencer.doc

**Current Intranet Search Title:** R400 Sequencer Specification

|           | APPROVALS      |
|-----------|----------------|
| Name/Dept | Signature/Date |
|           |                |
|           |                |
|           |                |
|           |                |
|           |                |



ATI 2033 LG v. ATI IPR2015-00325




#### Table Of Contents

| Section | Title                                      |
|---------|--------------------------------------------|
| 1.      | OVERVIEW                                   |
| 1.1     | Top Level Block Diagram                    |
| 1.2     | Data Flow graph (SP)                       |
| 1.3     | Control Graph                              |
| 2.      | INTERPOLATED DATA BUS                      |
| 3.      | INSTRUCTION STORE                          |
| 4.      | SEQUENCER INSTRUCTIONS                     |
| 5.      | CONSTANT STORES                            |
| 5.1     | Memory organizations                       |
| 5.2     | Management of the Control Flow Constants   |
| 5.3     | Management of the re-mapping tables        |
| 5.3.1   | R400 Constant management                   |
| 5.3.2   | Proposal for R400LE constant management    |
| 5.3.3   | Dirty bits                                 |
| 5.3.4   | Free List Block                            |
| 5.3.5   | De-allocate Block                          |
| 5.3.6   | Operation of Incremental model             |
| 5.4     | Constant Store Indexing                    |
| 5.5     | Real Time Commands                         |
| 5.6     | Constant Waterfalling                      |
| 6.      | LOOPING AND BRANCHES                       |
| 6.1     | The controlling state.                     |
| 6.2     | The Control Flow Program                   |
| 6.2.1   | Control flow instructions table            |
| 6.3     | Implementation                             |
| 6.4     | Data dependant predicate instructions      |
| 6.5     | HW Detection of PV,PS.                     |
| 6.6     | Register file indexing                     |
| 6.7     | Debugging the Shaders.                     |
| 6.7.1   | Method 1: Debugging registers              |
| 6.7.2   | Method 2: Exporting the values in the GPRs |
| 7.      | PIXEL KILL MASK                            |
| 8.      | MULTIPASS VERTEX SHADERS (HOS)             |
| 9.      | REGISTER FILE ALLOCATION                   |
| 10.     | FETCH ARBITRATION                          |
| 11.     | ALU ARBITRATION                            |
| 12.     | HANDLING STALLS                            |
| 13.     | CONTENT OF THE RESERVATION STATION FIFOS   |
| 14.     | THE OUTPUT FILE                            |
| 15.     | IJ FORMAT                                  |
| 15.1    | Interpolation of constant attributes       |
| 16.     | STAGING REGISTERS                          |

|                                                                                                                                                                      | T                                                 |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
                                                                                                                                                                                                                                                                            |                                         | <del></del>                                                                                                          |
|----------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------|----------------------------------------------------------------------------------------------------------------------|
| A n                                                                                                                                                                  | ORIGINATE DATE                                    | EDIT DATE                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
                                                                                                                                                                                                                                                                            | DOCUMENT-REV. NUM.                      | PAGE                                                                                                                 |
|                                                                                                                                                                      | 24 September, 2001                                | 4 September, 20159                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
                                                                                                                                                                                                                                                                            | GEN-CXXXXX-REVA                         | 3 of 53                                                                                                              |
|                                                                                                                                                                      |                                                   |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
                                                                                                                                                                                                                                                                            |                                         | 3334                                                                                                                 |
| | ort restrictions | | | 34 |
| 17.1.1 | Pixel exports: | | | 34 |
| 17.1.2 | Vertex exports: | | | 34 |
| 17.1.3 | | | | 34 |
| | | | | 34<br>34 |
| | ex Shading | | | 34 |
| | Shading | | | 35 |
| | | | | 35 |
| | l time commands | | | 35 |
| | tes/ XY screen coordinates/ FB information | | | 35<br>36 |
| 19.3.1 | | | | 36 |
| 19.3.2 | | | | 36 |
| | E MANAGEMENT | | | 37 |
|                                                                                                                                                                      |                                                   |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
                                                                                                                                                                                                                                                                            |                                         | 3735                                                                                                                 |
| ***************************************                                                                                                                              | DDRESS IMPORTS                                    |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
                                                                                                                                                                                                                                                                            |                                         | 3735                                                                                                                 |
| ***************************************                                                                                                                              | ex indexes imports                                | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
                                                                                                                                                                                                                                                                            |                                         | 3735<br>3735                                                                                                         |
| ***************************************                                                                                                                              |                                                   |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
                                                                                                                                                                                                                                                                            |                                         | 3836                                                                                                                 |
|                                                                                                                                                                      |                                                   |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
                                                                                                                                                                                                                                                                            |                                         | 3836                                                                                                                 |
| 23.2 | SC to SP Interfaces |  |  | 38 |
| 23.2.1 | SC SP# |  |  | 38 |
| 23.2.2 | SC SQ |  |  | 39 |
| 23.2.3 | SQ to SX(SP): Interpolator bus |  |  | 41 |
| 23.2.4 | gister Data |  |  | 41 |
| 23.2.5 | terface |  |  | 41 |
| 23.2.6 |  |  |  | 44 |
| 23.2.2<br>23.2.3<br>23.2.4<br>23.2.5<br>23.2.6<br>23.2.7                                                                                                             | SC SQ                                             | plator bus<br>gister Data<br>terface<br>s                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
                                                                                                                                                                                                                                                                            |                                         | 3937<br>4139<br>4139<br>4139<br>4442<br>4442                                                                         |
| 23.2.2<br>23.2.3<br>23.2.4<br>23.2.5<br>23.2.6<br>23.2.7<br>23.2.8                                                                                                   | SC SQ                                             | plator bus<br>gister Data<br>terface<br>control                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
                                                                                                                                                                                                                                                                            |                                         | 3937<br>4139<br>4139<br>4139<br>4139<br>4442<br>4442<br>4543                                                         |
| 23.2.2<br>23.2.3<br>23.2.4<br>23.2.5<br>23.2.6<br>23.2.7<br>23.2.8<br>23.2.9                                                                                         | SC SQ                                             | plator bus gister Data terface control                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
                                                                                                                                                                                                                                                                            |                                         | 3937<br>4139<br>4139<br>4139<br>4442<br>4442<br>4543<br>4543                                                         |
| 23.2.2<br>23.2.3<br>23.2.4<br>23.2.5<br>23.2.6<br>23.2.7<br>23.2.8<br>23.2.9<br>23.2.10                                                                              | SC SQ                                             | plator bus gister Data terface control                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
                                                                                                                                                                                                                                                                            |                                         | 3937<br>4139<br>4139<br>4139<br>4442<br>4442<br>4543<br>4543<br>4543                                                 |
| 23.2.2<br>23.2.3<br>23.2.4<br>23.2.5<br>23.2.6<br>23.2.7<br>23.2.8<br>23.2.9<br>23.2.10<br>23.2.11                                                                   | SC SQ                                             | plator bus gister Data terface control s                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           
                                                                                                                                                                                                                                                                            |                                         | 3937<br>4139<br>4139<br>4442<br>4442<br>4543<br>4543<br>4644<br>4644                                                 |
| 23.2.2<br>23.2.3<br>23.2.4<br>23.2.5<br>23.2.6<br>23.2.7<br>23.2.8<br>23.2.9<br>23.2.10<br>23.2.11<br>23.2.12                                                        | SC SQ                                             | colator bus colator bus colator bus colator bus control control colator bus control control control colator bus control contro |                                         | 3937<br>4139<br>4139<br>4139<br>4442<br>4442<br>4543<br>4543<br>4644<br>4644<br>4745                                 |
| 23.2.2<br>23.2.3<br>23.2.4<br>23.2.5<br>23.2.6<br>23.2.7<br>23.2.8<br>23.2.9<br>23.2.10<br>23.2.11<br>23.2.12<br>23.2.13                                             | SC SQ                                             | plator bus  egister Data  terface  control  s  ill  uto counter  ns  ddress load/ Predicate                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
                                                                                                                                                                                                                                                                            | Set/Kill set.                           | 3937<br>4139<br>4139<br>4139<br>4442<br>4442<br>4543<br>4543<br>4644<br>4644<br>4745<br>4846                         |
| 23.2.2<br>23.2.3<br>23.2.4<br>23.2.5<br>23.2.6<br>23.2.7<br>23.2.8<br>23.2.9<br>23.2.10<br>23.2.11<br>23.2.12<br>23.2.13<br>23.2.14                                  | SC SQ                                             | plator bus  gister Data  terface  control  ill  ill  iuto counter  ns  ddress load/ Predicate  proadcast                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           
                                                                                                                                                                                                                                                                            | Set/Kill set.                           | 3937<br>4139<br>4139<br>4139<br>4442<br>4442<br>4543<br>4543<br>4644<br>4644<br>4745<br>4846                         |
| 23.2.2<br>23.2.3<br>23.2.4<br>23.2.5<br>23.2.6<br>23.2.7<br>23.2.8<br>23.2.9<br>23.2.10<br>23.2.11<br>23.2.12<br>23.2.13<br>23.2.14<br>23.2.15                       | SC SQ                                             | plator bus gister Data terface control s ill all auto counter as ddress load/ Predicate proadcast                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
                                                                                                                                                                                                                                                                            | Set/Kill set.                           | 3937<br>4139<br>4139<br>4442<br>4442<br>4543<br>4543<br>4644<br>4644<br>4745<br>4846<br>4846<br>4846                 |
| 23.2.2<br>23.2.3<br>23.2.4<br>23.2.5<br>23.2.6<br>23.2.7<br>23.2.8<br>23.2.9<br>23.2.10<br>23.2.11<br>23.2.12<br>23.2.13<br>23.2.14                                  | SC SQ                                             | plator bus egister Data sterface s control s ill suto counter ns ddress load/ Predicate proadcast                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
                                                                                                                                                                                                                                                                            | Set/Kill set.                           | 3937<br>4139<br>4139<br>4139<br>4442<br>4442<br>4543<br>4543<br>4644<br>4745<br>4846<br>4846<br>4946                 |
| 23.2.2<br>23.2.3<br>23.2.4<br>23.2.5<br>23.2.6<br>23.2.7<br>23.2.8<br>23.2.9<br>23.2.10<br>23.2.11<br>23.2.12<br>23.2.13<br>23.2.14<br>23.2.15<br>23.2.16<br>23.2.17 | SC SQ | plator bus egister Data sterface control s s dill suto counter ns ddress load/ Predicate proadcast | Set/Kill set. | 3937<br>4139<br>4139<br>4139<br>4442<br>4442<br>4543<br>4543<br>4644<br>4745<br>4846<br>4846<br>4946<br>4946<br>4947 |




| 1. OVERVIEW                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          | 7  |
|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----|
| 1.1 Top Level Block Diagram                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          | 9  |
| 1.2 Data Flow graph (SP)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |    |
| 1.3 Control Graph                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | 11 |
| 2. INTERPOLATED DATA BUS                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |    |
| 3. INSTRUCTION STORE                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |    |
| 4. SEQUENCER INSTRUCTIONS                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |    |
| 5. CONSTANT STORES                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |    |
| 5.1 Memory organizations                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             | 14 |
| 5.2 Management of the Control Flow Constants                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |    |
| 5.3 Management of the re-mapping tables                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |    |
| 5.3.1 R400 Constant management                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |    |
| 5.3.2 Proposal for R400LE constant management                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |    |
| 5.3.3 Dirty bits                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |    |
| 5.3.4 Free List Block                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                | 17 |
| 5.3.5 De-allocate Block                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |    |
| 5.3.6 Operation of Incremental model                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 | 18 |
| 5.4 Constant Store Indexing                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          | 18 |
| 5.5 Real Time Commands                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               | 19 |
| 5.6 Constant Waterfalling                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            | 19 |
| 6. LOOPING AND BRANCHES                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              | 20 |
| 6.1 The controlling state                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            | 20 |
| 6.2 The Control Flow Program                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |    |
| 6.2.1 Control flow instructions table | 21 |
| 6.3 Implementation | 23 |
| 6.4 Data dependant predicate instructions |    |
| 6.5 HW Detection of PV, PS | 25 |
| 6.6 Register file indexing                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           | 25 |
| 6.7 Debugging the Shaders                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |    |
| 6.7.1 Method 1: Debugging registers                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |    |
| 6.7.2 Method 2: Exporting the values in the GPRs                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     | 26 |
| 7. PIXEL KILL MASK                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   | 26 |
| 8. MULTIPASS VERTEX SHADERS (HOS)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | 26 |
| 9. REGISTER FILE ALLOCATION                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |    |
| 10. FETCH ARBITRATION                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                | 28 |
| 11. ALU ARBITRATION                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |    |
| 12. HANDLING STALLS                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  | 29 |
| 13. CONTENT OF THE RESERVATION STATION FIFOS                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | 29 |
| 14. THE OUTPUT FILE                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |    |
| 15. IJ FORMAT | 29 |
| 15.1 Interpolation of constant attributes                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |    |
| 16. STAGING REGISTERS                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                | 30 |
| 17. THE PARAMETER CACHE                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |    |
| 17.1 Export restrictions                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             | 32 |



| Section                                        | Page |
|------------------------------------------------|------|
| 24.2.16 SQ to CP: RBBM bus                     | 45   |
| 24.2.17 CP to SQ: RBBM bus                     | 45   |
| 24.2.18 SQ to CP: State report                 | 45   |
| 24.3 Example of control flow program execution |      |




## Revision Changes:

- Rev 0.1 (Laurent Lefebvre) Date: May 7, 2001
- Rev 0.2 (Laurent Lefebvre) Date: July 9, 2001
- Rev 0.3 (Laurent Lefebvre) Date: August 6, 2001
- Rev 0.4 (Laurent Lefebvre) Date: August 24, 2001
- Rev 0.5 (Laurent Lefebvre) Date: September 7, 2001
- Rev 0.6 (Laurent Lefebvre) Date: September 24, 2001
- Rev 0.7 (Laurent Lefebvre) Date: October 5, 2001
- Rev 0.8 (Laurent Lefebvre) Date: October 8, 2001
- Rev 0.9 (Laurent Lefebvre) Date: October 17, 2001
- Rev 1.0 (Laurent Lefebvre) Date: October 19, 2001
- Rev 1.1 (Laurent Lefebvre) Date: October 26, 2001
- Rev 1.2 (Laurent Lefebvre) Date: November 16, 2001
- Rev 1.3 (Laurent Lefebvre) Date: November 26, 2001
- Rev 1.4 (Laurent Lefebvre) Date: December 6, 2001
- Rev 1.5 (Laurent Lefebvre) Date: December 11, 2001
- Rev 1.6 (Laurent Lefebvre) Date: January 7, 2002
- Rev 1.7 (Laurent Lefebvre) Date: February 4, 2002
- Rev 1.8 (Laurent Lefebvre) Date: March 4, 2002
- Rev 1.9 (Laurent Lefebvre) Date: March 18, 2002
- Rev 1.10 (Laurent Lefebvre) Date: March 25, 2002
- Rev 1.11 (Laurent Lefebvre) Date: April 19, 2002
- Rev 2.0 (Laurent Lefebvre) Date: April 19, 2002

First draft

Changed the interfaces to reflect the changes in the SP. Added some details in the arbitration section. Reviewed the Sequencer spec after the meeting on August 3, 2001.

Added the dynamic allocation method for register file and an example (written in part by Vic) of the flow of pixels/vertices in the sequencer. Added timing diagrams (Vic)

Changed the spec to reflect the new R400 architecture. Added interfaces.

Added constant store management, instruction store management, control flow management and data dependent predication.

Changed the control flow method to be more flexible. Also updated the external interfaces. Incorporated changes made in the 10/18/01 control flow meeting. Added a NOP instruction, removed the conditional\_execute\_or\_jump. Added debug registers.

Refined interfaces to RB. Added state registers.

Added SEQ-SP0 interfaces. Changed delta precision. Changed VGT-SP0 interface. Debug Methods added.

Interfaces greatly refined. Cleaned up the spec.

Added the different interpolation modes.

Added the auto incrementing counters. Changed the VGT-SQ interface. Added content on constant management. Updated GPRs.

Removed from the spec all interfaces that weren't directly tied to the SQ. Added explanations on constant management. Added synchronization fields and explanation.

Added more details on the staging register. Added detail about the parameter caches. Changed the call instruction to a Conditional call instruction. Added details on constant management and updated the diagram.

Added Real Time parameter control in the SX interface. Updated the control flow section.

New interfaces to the SX block. Added the end of clause modifier, removed the end of clause instructions.

Rearrangement of the CF instruction bits in order to ensure byte alignment.

Updated the interfaces and added a section on exporting rules.

Added CP state report interface. Last version of the spec with the old control flow scheme

New control flow scheme




- Rev 2.01 (Laurent Lefebvre) Date: May 2, 2002
- Rev 2.02 (Laurent Lefebvre) Date: May 13, 2002
- Rev 2.03 (Laurent Lefebvre) Date: July 15, 2002
- Rev 2.04 (Laurent Lefebvre) Date: August 2, 2002
- Rev 2.05 (Laurent Lefebvre) Date:

Changed slightly the control flow instructions to allow force jumps and calls.

Updated the Opcodes. Added type field to the constant/pred interface. Added Last field to the SQ—SP instruction load interface.

SP interface updated to include predication optimizations. Added the predicate no stall instructions.

Documented the new parameter generation scheme for XY coordinates points and lines STs.




## 1. Overview

The sequencer chooses two ALU threads and a fetch thread to execute, and executes all of the instructions in a block before looking for a new clause of the same type. Two ALU threads are executed interleaved to hide the ALU latency. The arbitrator will give priority to older threads. There are two separate reservation stations, one for pixel vectors and one for vertex vectors. This way a pixel can pass a vertex and a vertex can pass a pixel.

To support the shader pipe, the sequencer also contains the shader instruction cache, constant store, control flow constants and texture state. The four shader pipes all execute the same instruction; thus there is only one sequencer for the whole chip.

The sequencer first arbitrates between vectors of 64 vertices that arrive directly from primitive assembly and vectors of 16 quads (64 pixels) that are generated in the scan converter.

The vertex or pixel program specifies how many GPRs it needs to execute. The sequencer will not start the next vector until the needed space is available in the GPRs.




### 1.1 Top Level Block Diagram



Figure 2: Reservation stations and arbiters

Under this new scheme, the sequencer (SQ) will only use one global state management machine per vector type (pixel, vertex) that we call the reservation station (RS).






The gray area represents blocks that are replicated 4 times per shader pipe (16 times on the overall chip).

### 1.3 Control Graph



Figure 4: Sequencer Control interfaces

The Fetch control interface is shown in green, the ALU control interface in red, the Interpolated/Vector control interface in blue, and the output file control interface in purple.

## 2. Interpolated data bus

The interpolators contain an IJ buffer to pack the information as much as possible before writing it to the register file.



[Timing diagram not recoverable from the source: one row per shader pipe (SP0 to SP3) against clock columns up to T23, with cells labelled XY, IJ and parameter (A, B, C, D) transfers.]

Figure 6: Interpolation timing diagram


Above is an example of a tile the sequencer might receive from the SC. The write side shows how the data is stacked into the XY and IJ buffers; the read side shows how the data is passed to the GPRs. The IJ information is packed into the IJ buffer 4 quads at a time, or two clocks. At any given time the sequencer allows as many as four quads to interpolate a parameter; they all have to come from the same primitive. The sequencer then controls the write mask to the GPRs so that only the valid data is written.

## 3. Instruction Store

There is going to be only one instruction store for the whole chip. It will contain 4096 instructions of 96 bits each.

It is likely to be a 1 port memory; we use 1 clock to load the ALU instruction, 1 clock to load the Fetch instruction, 1 clock to load 2 control flow instructions and 1 clock to write instructions.
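As a rough check of the figures above, the following sketch computes the store capacity and models the single port as a four-slot rotation. The slot order and all names are assumptions made for illustration; the text only states how many clocks each kind of access uses.

```python
# Illustrative sketch only: slot order and names are assumptions.
INSTRUCTION_COUNT = 4096
INSTRUCTION_BITS = 96

# One access kind per clock on the single memory port (order is assumed).
PORT_SLOTS = ("alu_read", "fetch_read", "cf_read_x2", "write")


def store_size_bytes() -> int:
    """Total capacity of the instruction store in bytes."""
    return INSTRUCTION_COUNT * INSTRUCTION_BITS // 8


def port_slot(clock: int) -> str:
    """Access kind that owns the single memory port on a given clock."""
    return PORT_SLOTS[clock % len(PORT_SLOTS)]


print(store_size_bytes())                # 49152 bytes, i.e. 48 KiB
print([port_slot(c) for c in range(5)])
```

Under this assumed rotation each kind of access gets the port once every four clocks.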

The instruction store is loaded by the CP thru the register mapped registers.

The VS\_BASE and PS\_BASE context registers are used to specify for each context where its shader is in the instruction memory.

Real time commands are handled in much the same way, with some small differences. There are no wrap-around points for real time, so the driver must be careful not to overwrite regular shader data. The shared code (shared subroutines) uses the same path as real time.

## 4. Sequencer Instructions

All control flow instructions and move instructions are handled by the sequencer only. The ALUs will perform NOPs during this time (MOV PV,PV, PS,PS) if they have nothing else to do.

## 5. Constant Stores

### 5.1 Memory organizations

A likely size for the ALU constant store is 1024x128 bits. The read BW from the ALU constant store is 128 bits/clock and the write bandwidth is 32 bits/clock (directed by the CP bus size not by memory ports).

The maximum logical size of the constant store for a given shader is 256 constants, or 512 for the pixel/vertex shader pair. The size of the re-mapping table is 128 lines (each line addresses 4 constants). The write granularity is 4 constants or 512 bits. It takes 16 clocks to write the four constants. Real time requires 256 lines in the physical memory (this is physically register mapped).

The texture state is also kept in a similar memory. The size of this memory is 320x96 bits (128 texture states for regular mode, 32 states for RT). The memory thus holds 128 texture states (192 bits per state). The logical size exposes 32 different states total, which are going to be shared between the pixel and the vertex shader. The size of the re-mapping table for the texture state memory is 32 lines (each line addresses 1 texture state line in the real memory). The CP write granularity is 1 texture state line (or 192 bits). The driver sends 512 bits but the CP ignores the top 320 bits. It thus takes 6 clocks to write the texture state. Real time requires 32 lines in the physical memory (this is physically register mapped).
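The write times quoted above follow from the 32 bits/clock write bandwidth. A minimal sketch of that arithmetic, with hypothetical names:

```python
# Illustrative arithmetic only; names are hypothetical.
CP_WRITE_BITS_PER_CLOCK = 32  # write bandwidth, set by the CP bus size


def write_clocks(payload_bits: int) -> int:
    """Clocks needed to push payload_bits through the CP write path."""
    return -(-payload_bits // CP_WRITE_BITS_PER_CLOCK)  # ceiling division


alu_line_bits = 4 * 128          # one re-mapping line: 4 constants of 128 bits
texture_state_bits = 512 - 320   # driver sends 512 bits, top 320 are ignored
logical_constants = 128 * 4      # re-mapping lines x constants per line

print(write_clocks(alu_line_bits))       # 16
print(write_clocks(texture_state_bits))  # 6
print(logical_constants)                 # 512
```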

The control flow constant memory doesn't sit behind a renaming table. It is register mapped and thus the driver must reload its content each time there is a change in the control flow constants. Its size is 320\*32 because it must hold 8 copies of the 32 dwords of control flow constants and the loop construct constants must be aligned.

The constant re-mapping tables for texture state and ALU constants are logically register mapped for regular mode and physically register mapped for RT operation.




### 5.2 Management of the Control Flow Constants

The control flow constants are register mapped, thus the CP writes to the corresponding register to set the constant; the SQ decodes the address and writes to the block pointed to by its current base pointer (CF\_WR\_BASE). On the read side, one level of indirection is used. A register (SQ\_CONTEXT\_MISC.CF\_RD\_BASE) keeps the current base pointer to the control flow block. This register is copied whenever there is a state change. Should the CP write to CF after the state change, the base register is updated with (current pointer number + 1) % number of states. This way, if the CP doesn't write to CF, the state is going to use the previous CF constants.
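The base pointer behaviour described above can be sketched as a small model. This is an illustration under stated assumptions, not the hardware design: the class, the `pending` flag, and the idea that states are numbered consecutively are all hypothetical.

```python
# Illustrative model only; class layout and state numbering are assumptions.
NUM_STATES = 8   # copies of the control flow constants (section 5.1)
CF_DWORDS = 32   # dwords of control flow constants per copy


class CfConstants:
    def __init__(self) -> None:
        self.blocks = [[0] * CF_DWORDS for _ in range(NUM_STATES)]
        self.wr_base = 0                  # stands in for CF_WR_BASE
        self.rd_base = [0] * NUM_STATES   # per-state copy of CF_RD_BASE
        self.state = 0
        self.pending = False              # True until the first CF write of a state

    def state_change(self) -> None:
        """New state inherits the read base of the previous state."""
        previous = self.state
        self.state = (self.state + 1) % NUM_STATES
        self.rd_base[self.state] = self.rd_base[previous]
        self.pending = True

    def cp_write(self, offset: int, value: int) -> None:
        """CP write to a control flow constant register."""
        if self.pending:
            # First write after a state change: advance to the next block.
            # The new block is not pre-filled; the driver reloads it.
            self.wr_base = (self.wr_base + 1) % NUM_STATES
            self.rd_base[self.state] = self.wr_base
            self.pending = False
        self.blocks[self.wr_base][offset] = value

    def read(self, state: int, offset: int) -> int:
        return self.blocks[self.rd_base[state]][offset]
```

A state that never writes to CF keeps reading the block of the state before it, which is the behaviour the last sentence of the paragraph describes.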

### 5.3 Management of the re-mapping tables

#### 5.3.1 R400 Constant management

The sequencer is responsible for managing two re-mapping tables (one for the constant store and one for the texture state). On a state change (by the driver), the sequencer will broadside copy the contents of its re-mapping tables to a new one. We have 8 different re-mapping tables we can use concurrently.

The constant memory update will be incremental; the driver only needs to update the constants that actually changed between the two state changes.

For this model to work in its simplest form, the requirement is that the physical memory MUST be at least twice as large as the logical address space plus the space allocated for Real Time. In our case, since the logical address space is 512 and the reserved RT space can be up to 256 entries, the memory must hold 1280 entries or more. Similarly, the texture store must hold 32\*2+32 = 96 entries or more.
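The sizing rule above is a one-line formula. The function name below is hypothetical and only restates that rule:

```python
def min_physical_entries(logical_entries: int, rt_entries: int) -> int:
    """Smallest physical store satisfying the 2x-logical-plus-RT rule."""
    return 2 * logical_entries + rt_entries


print(min_physical_entries(512, 256))  # 1280 ALU constant entries
print(min_physical_entries(32, 32))    # 96 texture state entries
```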

#### 5.3.2 Proposal for R400LE constant management

To make this scheme work with only 512+256 = 768 entries, upon reception of a CONTROL packet of state + 1, the sequencer would check for SQ\_IDLE and PA\_IDLE and, if both are idle, will erase the content of the state and replace it with the new state (this is depicted in Figure 8: De-allocation mechanism). Note that when a state is cleared, a value of 0 is written to the corresponding de-allocation counter location so that when the SQ reports a state change, nothing will be de-allocated upon the first report.

The second path sets all context dirty bits that were used in the current state to 1 (thus allowing the new state to reuse these physical addresses if needed).








Figure 8: De-allocation mechanism for R400LE

#### 5.3.3 Dirty bits

Two sets of dirty bits will be maintained per logical address. The first one (the reset dirty bit) will be set to zero on reset and set when the logical address is addressed. The second one (the context dirty bit) will be set to zero whenever a new context is written and set for each address written while in this context. If the reset dirty bit is not set, then writing to that logical address will not require de-allocation of whatever address is stored in the renaming table. If it is set and the context dirty bit is not set, then the physical address stored needs to be de-allocated and a new physical address is necessary to store the incoming data. If they are both set, then the data will be written into the physical address held in the renaming table for the current logical address. No de-allocation or allocation takes place. This will happen when the driver does a set constant twice to the same logical address between context changes. NOTE: It is important to detect this case and prevent a new allocation; failure to do so will allow multiple writes to allocate all physical memory and thus hang, because a context will not fit for rendering to start and thus free up space.
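The three cases above amount to a small decision function. The sketch below is illustrative; the enum and function names are hypothetical, and the order of the tests simply follows the order in which the paragraph states them.

```python
from enum import Enum


class WriteAction(Enum):
    ALLOCATE = "allocate"                                # nothing to de-allocate
    DEALLOCATE_AND_ALLOCATE = "deallocate_and_allocate"  # replace the old address
    OVERWRITE = "overwrite"                              # reuse the current address


def classify_write(reset_dirty: bool, context_dirty: bool) -> WriteAction:
    """Decide what a set-constant write to one logical address requires."""
    if not reset_dirty:
        # Address never written since reset: no old physical address to free.
        return WriteAction.ALLOCATE
    if not context_dirty:
        # Written in an earlier context: free the old address, take a new one.
        return WriteAction.DEALLOCATE_AND_ALLOCATE
    # Written already in this context: overwrite in place, allocate nothing.
    return WriteAction.OVERWRITE


print(classify_write(True, True).value)
```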

#### 5.3.4 Free List Block

The free list block would consist of a counter (called the IFC, or Initial Free Counter) that resets to zero and is incremented every time a chunk of physical memory is used, until all chunks have been used once. This counter would be checked each time a physical block is needed: if the original blocks have not all been used up, a new one is used; otherwise the free list is checked for an available physical block address. When a chunk is taken from the counter, the count is its physical address.

The block also needs storage for a free list big enough to hold all physical block addresses.

Three pointers, all reset to zero, are maintained for the free list. The first one we will call write\_ptr. This pointer identifies the next location at which to write the physical address of a block to be de-allocated. Note: we can never free more physical memory locations than we have. Once an address is recorded, the pointer is incremented to walk the free list like a ring.

The second pointer will be called stop\_ptr. The stop\_ptr pointer will be advanced by the number of address chunks de-allocated when a context finishes. The addresses between the stop\_ptr and write\_ptr cannot be reused because they are still in use. But as soon as the context using them is dismissed, the stop\_ptr will be advanced.

The third pointer will be called read\_ptr. This pointer will point to the next address that can be used for allocation, as long as the read\_ptr does not equal the stop\_ptr and the IFC is at its maximum count.
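The counter and the three pointers can be modelled in a few lines. This is a sketch under assumptions, not the block's implementation: the class and method names are hypothetical, and the pointers are kept as ever-increasing integers (indexed modulo the list size) so that an empty ring and a full ring are easy to tell apart in software.

```python
from typing import List, Optional


class FreeList:
    """Toy model of the IFC plus a ring of de-allocated block addresses."""

    def __init__(self, num_blocks: int) -> None:
        self.num_blocks = num_blocks
        self.ifc = 0                           # Initial Free Counter
        self.ring: List[int] = [0] * num_blocks
        self.write_ptr = 0                     # next slot for a freed address
        self.stop_ptr = 0                      # end of the reusable region
        self.read_ptr = 0                      # next address to hand out

    def allocate(self) -> Optional[int]:
        """Return a physical block address, or None to signal backpressure."""
        if self.ifc < self.num_blocks:
            address = self.ifc                 # the count is the address
            self.ifc += 1
            return address
        if self.read_ptr != self.stop_ptr:
            address = self.ring[self.read_ptr % self.num_blocks]
            self.read_ptr += 1
            return address
        return None

    def record_free(self, address: int) -> None:
        """Queue a replaced address; it stays unusable until retired."""
        self.ring[self.write_ptr % self.num_blocks] = address
        self.write_ptr += 1

    def retire(self, freed_count: int) -> None:
        """A context finished: release the blocks it had queued."""
        self.stop_ptr += freed_count
        assert self.stop_ptr <= self.write_ptr
```

An address recorded with `record_free` only becomes allocatable after `retire` moves stop\_ptr past it, which mirrors the rule that entries between stop\_ptr and write\_ptr are still in use.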




#### 5.3.5 De-allocate Block

This block will maintain a free physical address block count for each context. While in the current context, a count shall be maintained specifying how many blocks were written into the free list at the write\_ptr pointer. This count will be reset upon reset or when this context is active on the back and different from the previous context. It is actually a count of blocks in the previous context that will no longer be used. This count will be used to advance the stop\_ptr pointer to make available the set of physical blocks freed when the previous context was done. This allows the discard or de-allocation of any number of blocks in one clock.

#### 5.3.6 Operation of Incremental model

The basic operation of the model starts with the write\_ptr, stop\_ptr and read\_ptr pointers in the free list set to zero and the free list counter set to zero. All the dirty bits and the previous context are also initialized to zero.

When the first set constant happens, the reset dirty bit will not be set, so a physical location is allocated from the free list counter because it is not at its maximum value. The data is written into physical address zero. Both the additional copy of the renaming table and the context zero entry of the big renaming table are updated with physical address 0 for the logical address that was written. This process is repeated for any logical addresses that are not dirty until the context changes. If a logical address is hit that has its dirty bits set while in the same context, both dirty bits will be set, so the new data overwrites the last physical address assigned to this logical address.

When the first draw command of the context is detected, the previous context stored in the additional renaming table is copied to the larger renaming table in the current (new) context location. The logical address of the set constant is then loaded with a new physical address during the copy and, if the reset dirty bit was set, the physical address it replaced in the renaming table is entered at the write\_ptr location on the free list and write\_ptr is incremented. The de-allocation counter for the previous context (eight) is incremented. Then, as set states come in for this context, one of the following will happen:

- 1.) No dirty bits are set for the logical address being updated. A line will be allocated from the free-list counter, or from the free list at the read\_ptr pointer if read\_ptr != stop\_ptr.
- 2.) Reset dirty set and context dirty not set. A new physical address is allocated, the physical address in the renaming table is put on the free list at write\_ptr, and write\_ptr is incremented along with the de-allocate counter for the last context.
- 3.) Context dirty is set. The data will be written into the physical address specified by the logical address.

This process will continue as long as set states arrive. This block will provide backpressure to the CP whenever it has no free list entries available (counter at max and stop\_ptr == read\_ptr). The command stream will keep a count of contexts of constants in use and prevent more than the maximum number of constant contexts from being sent.

Whenever a draw packet arrives, the content of the re-mapping table is written to the correct re-mapping table for the context number. Also, if the next context uses fewer constants than the current one, all exceeding lines are moved to the free list to be de-allocated later. This happens in parallel with the writing of the re-mapping table to the correct memory.

Preferably, when the constant context leaves the last ALU clause it will be sent to this block and compared with the previous context that left (initialized to zero). If they differ, then the older context will no longer be referenced and thus can be de-allocated in the physical memory. This is accomplished by adding the number of blocks freed by this context to the stop\_ptr pointer. This will make all the physical addresses used by this context available to the read\_ptr allocate pointer for future allocation.

This device allows representation of multiple contexts of constant data with N copies of the logical address space. It also allows the second context to be represented as the first set plus some new additional data by just storing the deltas. It allows memory to be used efficiently, and when the constant updates are small it can store multiple contexts. However, if the updates are large, fewer contexts will be stored and performance will potentially be degraded, although it will still perform as well as a ring could in this case.

### 5.4 Constant Store Indexing

In order to do constant store indexing, the sequencer must first be loaded with the indexes (that come from the GPRs). There are 144 wires from the exit of the SP to the sequencer (9-bit pointers x 16 vertices/clock). Since the data must pass through the shader pipe for the float to fixed conversion, there is a latency of 4 clocks (1 instruction) between the time the sequencer is loaded and the time one can index into the constant store. The assembly will look like this:

MOVA R1.X,R2.X // Loads the sequencer with the content of R2.X, also copies the content of R2.X into R1.X

NOP // latency of the float to fixed conversion

ADD R3,R4,C0[R2.X] // Uses the state from the sequencer to add R4 to C0[R2.X] into R3

Note that we don't really care about what is in the brackets because we use the state from the MOVA instruction. R2.X is just written again for the sake of simplicity and coherency.

The storage needed in the sequencer in order to support this feature is 2\*64\*9 bits = 1152 bits.
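The wire count and storage figure in this section can be reproduced with a few constants. The names are hypothetical, and the factor of 2 is taken as-is from the text without interpreting what it stands for.

```python
# Illustrative arithmetic only; names are hypothetical.
INDEX_BITS = 9            # width of one constant-store pointer
VERTICES_PER_CLOCK = 16   # indexes leaving the SP each clock
VECTOR_SIZE = 64          # vertices or pixels per vector
COPIES = 2                # factor of 2 taken as-is from the text

wires = INDEX_BITS * VERTICES_PER_CLOCK
storage_bits = COPIES * VECTOR_SIZE * INDEX_BITS
addressable_constants = 1 << INDEX_BITS  # range of a 9-bit index

print(wires)                  # 144
print(storage_bits)           # 1152
print(addressable_constants)  # 512
```

A 9-bit index spans 512 entries, which matches the 512-constant logical space of a pixel/vertex shader pair given in section 5.1.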

### 5.5 Real Time Commands

The real time command constants are written by the CP using the register mapped registers allocated for RT. This works the same way as regular constant loads, except that in this case the CP sends a physical address rather than a logical address, and the reads do not pass through the re-mapping table but are read directly from the memory. The boundary between the two zones is defined by the CONST\_EO\_RT control register. Similarly, for the fetch state, the boundary between the two zones is defined by the TSTATE\_EO\_RT control register.

### 5.6 Constant Waterfalling

In order to have reasonable performance in the case of constant store indexing using the address register, we are going to have the possibility of using the physical memory port for read only. This way we can read 1 constant per clock and thus have a worst-case waterfall mode of 1 vertex per clock. There is a small synchronization issue related to this, as the SQ needs to make sure that the constants were actually written to memory (not only sent to the sequencer) before it can allow the first vector of pixels or vertices of the state to go through the ALUs. To do so, the sequencer keeps 8 bits (one per render state); it sets a bit whenever the last render state is written to memory and clears the bit whenever a state is freed.
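The 8 synchronization bits behave like a small bit mask. A minimal sketch, with hypothetical class and method names, assuming that a vector may enter the ALUs only when the bit for its state is set:

```python
# Illustrative model only; names and the gating rule wording are assumptions.
NUM_RENDER_STATES = 8


class StateWrittenBits:
    def __init__(self) -> None:
        self.bits = 0  # one bit per render state, all clear at reset

    def mark_written(self, state: int) -> None:
        """The constants of this state have reached memory."""
        self.bits |= 1 << state

    def free(self, state: int) -> None:
        """The state has been freed; its bit is cleared."""
        self.bits &= ~(1 << state) & ((1 << NUM_RENDER_STATES) - 1)

    def may_issue(self, state: int) -> bool:
        """True when a vector of this state may go through the ALUs."""
        return bool((self.bits >> state) & 1)
```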



Figure 9: The Constant store




### 6. Looping and Branches

Loops and branches are planned to be supported and will have to be dealt with at the sequencer level. We plan on supporting constant loops and branches using a control program.

## 6.1 The controlling state.

The R400 controlling state consists of:

Boolean[255:0]

Loop\_count[7:0][31:0]

Loop\_Start[7:0][31:0]

Loop\_Step[7:0][31:0]

That is 256 Booleans and 32 loops.

We have a stack of 4 elements for nested calls of subroutines and 4 loop counters to allow for nested loops.

This state is available on a per shader program basis.

### 6.2 The Control Flow Program

We'd like to be able to code up a program of the form:

1: Loop
2: Exec TexFetch
3: TexFetch
4: ALU
5: ALU
6: TexFetch
7: End Loop
8: ALU Export

But realize that 3: may be dependent on 2: and 4: is almost certainly dependent on 2: and 3:. Without clausing, these dependencies need to be expressed in the Control Flow instructions. Additionally, without separate 'texture clauses' and 'ALU clauses' we need to know which instructions to dispatch to the Texture Unit and which to the ALU unit. This information will be encapsulated in the flow control instructions.

Each control flow instruction will contain 2 bits of information for each (non-control flow) instruction:

- a) ALU or Texture
- b) Serialize Execution

(b) would force the thread to stop execution at this point (before the instruction is executed) and wait until all textures have been fetched. Given the allocation of reserved bits, this would mean that the count of an 'Exec' instruction would be limited to about 8 (non-control-flow) instructions. If more than this were needed, a second Exec (with the same conditions) would be issued.

Another function that relies upon 'clauses' is allocation and order of execution. We need to assure that pixels and vertices are exported in the correct order (even if not all execution is ordered) and that space in the output buffers is allocated in order. Additionally, data can't be exported until space is allocated. A new control flow instruction:

Alloc <buffer select -- position,parameter, pixel or vertex memory. And the size required>.

would be created to mark where such allocation needs to be done. To assure allocation is done in order, the actual allocation for a given thread cannot be performed unless the equivalent allocation for all previous threads is already completed. The implementation would also assure that execution of instruction(s) following the serialization due to the Alloc will occur in order -- at least until the next serialization or change from ALU to Texture. In most cases this will allow the exports to occur without any further synchronization. Only 'final' allocations or position allocations are guaranteed to be ordered. Because strict ordering is required for pixels, parameters and positions, this implies only a single alloc for these structures. Vertex exports to memory do not require ordering during allocation and so multiple 'allocs' may be done.

#### 6.2.1 Control flow instructions table

Here is the revised control flow instruction set.

Note that whenever a field is marked as RESERVED, it is assumed that all the bits of the field are cleared (0).

| NOP   |            |          |
|-------|------------|----------|
| 47:44 | 43         | 42:0     |
| 0000  | Addressing | RESERVED |
This is a regular NOP.

| Execute |            |          |                                                |       |              |
|---------|------------|----------|------------------------------------------------|-------|--------------|
| 47:44   | 43         | 40:34    | 33:16                                          | 15:12 | 11:0         |
| 0001    | Addressing | RESERVED | Instructions type + serialize (9 instructions) | Count | Exec Address |

| Execute_End |            |          |                                                |       |              |
|-------------|------------|----------|------------------------------------------------|-------|--------------|
| 47:44       | 43         | 40:34    | 33:16                                          | 15:12 | 11:0         |
| 0010        | Addressing | RESERVED | Instructions type + serialize (9 instructions) | Count | Exec Address |

Execute up to 9 instructions at the specified address in the instruction memory. The Instruction type field tells the sequencer the type of each instruction (LSB: 1 = Texture, 0 = ALU) and whether or not to serialize its execution (MSB: 1 = Serialize, 0 = Non-Serialized). If Execute\_End, this is the last execution block of the shader program.
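As an illustration of the field layout, the sketch below splits a 48-bit Execute word into its fields. The bit positions come from the table above; the ordering of the 2-bit pairs inside bits 33:16 (instruction 0 in the lowest pair) is an assumption, and the function name `decode_execute` is hypothetical.

```python
def decode_execute(word):
    """Split a 48-bit Execute / Execute_End word into its fields."""
    opcode = (word >> 44) & 0xF             # bits 47:44
    addressing = (word >> 43) & 0x1         # bit 43
    type_serialize = (word >> 16) & 0x3FFFF  # bits 33:16, 9 instructions x 2 bits
    count = (word >> 12) & 0xF              # bits 15:12
    exec_address = word & 0xFFF             # bits 11:0

    instructions = []
    for i in range(min(count, 9)):
        # Assumed packing: instruction i uses bits (2i+1):(2i) of the field.
        pair = (type_serialize >> (2 * i)) & 0x3
        instructions.append({
            "unit": "Texture" if pair & 0x1 else "ALU",  # LSB: 1 = Texture, 0 = ALU
            "serialize": bool(pair & 0x2),               # MSB: 1 = Serialize
        })
    return {
        "opcode": opcode,
        "addressing": addressing,
        "count": count,
        "exec_address": exec_address,
        "instructions": instructions,
    }
```

The same decoding would apply to the conditional variants, which reuse bits 33:0 and differ only in bits 42:34.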

| Conditional_Execute |            |           |                 |                                                |       |              |
|---------------------|------------|-----------|-----------------|------------------------------------------------|-------|--------------|
| 47:44               | 43         | 42        | 41:34           | 33:16                                          | 15:12 | 11:0         |
| 0011                | Addressing | Condition | Boolean address | Instructions type + serialize (9 instructions) | Count | Exec Address |

| Conditional_Execute_End |            |           |                 |                                                |       |              |
|-------------------------|------------|-----------|-----------------|------------------------------------------------|-------|--------------|
| 47:44                   | 43         | 42        | 41:34           | 33:16                                          | 15:12 | 11:0         |
| 0100                    | Addressing | Condition | Boolean address | Instructions type + serialize (9 instructions) | Count | Exec Address |

If the specified Boolean (8 bits can address 256 Booleans) meets the specified condition then execute the specified instructions (up to 9 instructions). If the condition is not met, we go on to the next control flow instruction. If Conditional\_Execute\_End and the condition is met, this is the last execution block of the shader program.

| Conditional_Execute_Predicates |            |           |          |                  |                                                |       |              |
|--------------------------------|------------|-----------|----------|------------------|------------------------------------------------|-------|--------------|
| 47:44                          | 43         | 42        | 41:36    | 35:34            | 33:16                                          | 15:12 | 11:0         |
| 0101                           | Addressing | Condition | RESERVED | Predicate vector | Instructions type + serialize (9 instructions) | Count | Exec Address |

| Conditional_Execute_Predicates_End |            |           |          |                  |                                                |       |              |
|------------------------------------|------------|-----------|----------|------------------|------------------------------------------------|-------|--------------|
| 47:44                              | 43         | 42        | 41:36    | 35:34            | 33:16                                          | 15:12 | 11:0         |
| 0110                               | Addressing | Condition | RESERVED | Predicate vector | Instructions type + serialize (9 instructions) | Count | Exec Address |

Check the AND/OR of all current predicate bits. If AND/OR matches the condition, execute the specified number of instructions. We need to AND/OR this with the kill mask in order not to consider the pixels that aren't valid. If the condition is not met, we go on to the next control flow instruction. If Conditional\_Execute\_Predicates\_End and the condition is met, this is the last execution block of the shader program.
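The predicate reduction can be sketched as below. This is a minimal illustration under stated assumptions: `valid` is taken to be a 64-bit mask with 1 for pixels that are still valid (the inverse of "killed"), and the choice between the AND and the OR reduction is exposed as a parameter because the text does not say how it is selected. All function names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def predicate_or(pred, valid):
    """True if at least one valid pixel has its predicate bit set."""
    return (pred & valid & MASK64) != 0


def predicate_and(pred, valid):
    """True if every valid pixel has its predicate bit set.

    Note: with no valid pixels at all this is vacuously True.
    """
    valid &= MASK64
    return (pred & valid) == valid


def should_execute(pred, valid, condition, use_and):
    """Compare the masked reduction against the Condition bit."""
    reduced = predicate_and(pred, valid) if use_and else predicate_or(pred, valid)
    return reduced == bool(condition)
```

Masking with `valid` before reducing is what keeps invalid pixels from influencing the decision.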

| Conditional_Execute_Predicates_No_Stall |            |           |          |                  |                                                |       |              |
|-----------------------------------------|------------|-----------|----------|------------------|------------------------------------------------|-------|--------------|
| 47:44                                   | 43         | 42        | 41:36    | 35:34            | 33:16                                          | 15:12 | 11:0         |
| 1101                                    | Addressing | Condition | RESERVED | Predicate vector | Instructions type + serialize (9 instructions) | Count | Exec Address |

| Conditional_Execute_Predicates_No_Stall_End |            |           |          |                  |                                                |       |              |
|---------------------------------------------|------------|-----------|----------|------------------|------------------------------------------------|-------|--------------|
| 47:44                                       | 43         | 42        | 41:36    | 35:34            | 33:16                                          | 15:12 | 11:0         |
| 1110                                        | Addressing | Condition | RESERVED | Predicate vector | Instructions type + serialize (9 instructions) | Count | Exec Address |

Same as Conditional\_Execute\_Predicates but the SQ is not going to wait for the predicate vector to be updated. You can only set this in the compiler if you know that the predicate set is only a refinement of the current one (like a nested if) because the optimization would still work.

| Loop_Start |            |          |         |          |              |
|------------|------------|----------|---------|----------|--------------|
| 47:44      | 43         | 42:21    | 20:16   | 15:12    | 11:0         |
| 0111       | Addressing | RESERVED | loop ID | RESERVED | Jump address |

Loop Start. Compares the loop iterator with the end value. If the loop condition is not met, jump to the address. Forward jump only. Also computes the index value. The loop id must match between the start and the end, and also indicates which control flow constants should be used with the loop.

| Loop_End |    |       |       |       |       |      |
|----------|----|-------|-------|-------|-------|------|
| 47:44    | 43 | 42:24 | 23:21 | 20:16 | 15:12 | 11:0 |
| 1000     |    |       |       |       |       |      |

Loop end. Increments the counter by one, compares the loop count with the end value. If loop condition met, continue, else, jump BACK to the start of the loop. If predicate break != 0, then compares predicate vector n (specified by predicate break number). If all bits cleared then break the loop.

The way this is described does not prevent nested loops, and the inclusion of the loop id makes this easy to do.

| Conditional_Call |    |    |       |       |    |              |
|------------------|----|----|-------|-------|----|--------------|
| 47:44            | 43 | 42 | 41:34 | 33:13 | 12 | 11:0         |
| 1001             |    |    |       |       |    | Jump address |

If the condition is met, jumps to the specified address and pushes the control flow program counter on the stack. If force call is set the condition is ignored and the call is made always.

| Return |            |          |
|--------|------------|----------|
| 47:44  | 43         | 42:0     |
| 1010   | Addressing | RESERVED |

Pops the topmost address from the stack and jumps to that address. If nothing is on the stack, the program will just continue to the next instruction.
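The call and return behaviour can be sketched with a small stack model. The depth of 4 comes from section 6.1. Two points are assumptions made for illustration only: the address pushed is taken to be that of the instruction following the call, and a call on a full stack is modelled as a Python exception (the text only lists it as a detectable error). The class name `CallStack` is hypothetical.

```python
class CallStack:
    DEPTH = 4  # "a stack of 4 elements for nested calls of subroutines"

    def __init__(self):
        self.entries = []

    def call(self, pc, target):
        """Taken call at control flow address pc; returns the next address."""
        if len(self.entries) == self.DEPTH:
            raise OverflowError("call with stack full")
        self.entries.append(pc + 1)  # assumed: resume right after the call
        return target

    def ret(self, pc):
        """Return at control flow address pc; returns the next address."""
        if not self.entries:
            return pc + 1  # empty stack: continue with the next instruction
        return self.entries.pop()
```

With this model, a Return executed outside any subroutine simply falls through, which matches the behaviour described above.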

| Conditional_Jump |            |           |         |         |          |            |              |
|------------------|------------|-----------|---------|---------|----------|------------|--------------|
| 47:44            |            |           |         |         |          |            |              |
| 1011             | Addressing | Condition | Boolean | FW only | RESERVED | Force Jump | Jump address |

If force jump is set the condition is ignored and the jump is made always. If FW only is set then only forward jumps are allowed.




| Allocate |       |               |          |                 |
|----------|-------|---------------|----------|-----------------|
| 47:44    | 43    | 42:41         | 40:4     | 3:0             |
| 1100     | Debug | Buffer Select | RESERVED | Allocation size |

Buffer Select takes a value of the following:

01 - position export (ordered export)

10 - parameter cache or pixel export (ordered export)

11 - pass thru (out of order exports).

Size field is only used to reserve space in the export buffer for pass thru exports. Valid values are 1 (1 line) thru 9 (9 lines). It should be determined by the compiler/assembler by taking max index used +1.

Buffer Size takes a value of the following:

00 - 1 buffer

01 - 2 buffers

15 - 16 buffers

If debug is set this is a debug alloc (ignore if debug DB\_ON register is set to off).

### 6.3 Implementation

The envisioned implementation has a buffer that maintains the state of each thread. A thread lives in a given location in the buffer during its entire life, but the buffer has FIFO qualities in that threads leave in the order that they enter. Actually two buffers are maintained -- one for Vertices and one for Pixels. The intended implementation would allow for:

16 entries for vertices

48 entries for pixels.

From each buffer, arbitration logic attempts to select 1 thread for the texture unit and 1 (interleaved) thread for the ALU unit. Once a thread is selected it is read out of the buffer, marked as invalid, and submitted to the appropriate execution unit. It is returned to the buffer (at the same place) with its status updated once all possible sequential instructions have been executed. A switch from ALU to TEX or vice versa, or a Serialize\_Execution modifier, forces the thread to be returned to the buffer.

Each entry in the buffer will be stored across two physical pieces of memory - most bits will be stored in a 1 read port device. Only bits needed for thread arbitration will be stored in a highly multi-ported structure. The bits kept in the 1 read port device will be termed 'state'. The bits kept in the multi-read ported device will be termed 'status'.

#### 'State Bits' needed include:

- 1. Control Flow Instruction Pointer (13 bits),
- 2. Execution Count Marker (4 bits),
- 3. Loop Iterators (4x9 bits),
- 4. Call return pointers (4x12 bits),
- 5. Predicate Bits (64 bits),
- 6. Export ID (1 bit),
- 7. Parameter Cache base Ptr (7 bits),
- 8. GPR Base Ptr (8 bits),
- 9. Context Ptr (3 bits),
- 10. LOD corrections (6x16 bits),
- 11. Valid bits (64 bits),
- 12. RT (1 bit). Signifies that this thread is a Real Time thread. This bit must be sent to the Constant store state machine when reading it.




ORIGINATE DATE

24 September, 2001


Absent from this list are 'Index' pointers. These are costly enough that I'm presuming that they are instead stored in the GPRs. The first seven fields above (Control Flow Ptr, Execution Count, Loop Counts, call return ptrs, Predicate bits, PC base ptr and export ID) are updated every time the thread is returned to the buffer, based on how much progress has been made on thread execution. GPR Base Ptr, Context Ptr and LOD corrections are unchanged throughout execution of the thread.

'Status Bits' needed include:

- Valid Thread
- Texture/ALU engine needed
- Texture Reads are outstanding
- Waiting on Texture Read to Complete
- Allocation Wait (2 bits)
  - 00 - No allocation needed
  - 01 - Position export allocation needed (ordered export)
  - 10 - Parameter or pixel export needed (ordered export)
  - 11 - Pass thru (out of order export)
- Allocation Size (4 bits)
- Position Allocated
- First thread of a new context
- Event thread (NULL thread that needs to trickle down the pipe)
- Last (1 bit)
- Pulse SX (1 bit)

All of the above fields from all of the entries go into the arbitration circuitry. The arbitration circuitry will select a winner for both the Texture Engine and for the ALU engine. There are actually two sets of arbitration -- one for pixels and one for vertices. A final selection is then done between the two. But the rest of this implementation summary only considers the 'first' level selection which is similar for both pixels and vertices.

Texture arbitration requires no allocation or ordering so it is purely based on selecting the 'oldest' thread that requires the Texture Engine.

ALU arbitration is a little more complicated. First, only threads where either of Texture\_Reads\_outstanding or Waiting\_on\_Texture\_Read\_to\_Complete are '0' are considered. Then if Allocation\_Wait is active, these threads are further filtered based on whether space is available. If the allocation is position allocation, then the thread is only considered if all 'older' threads have already done their position allocation (position allocated bits set). If the allocation is parameter or pixel allocation, then the thread is only considered if it is the oldest thread. Also a thread is not considered if it is a parameter or pixel or position allocation, has its First\_thread\_of\_a\_new\_context bit set and would cause ALU interleaving with another thread performing the same parameter or pixel or position allocation. Finally the 'oldest' of the threads that pass through the above filters is selected. If the thread needed to allocate, then at this time the allocation is done, based on Allocation\_Size. If a thread has its "last" bit set, then it is also removed from the buffer, never to return.
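The filtering steps above can be sketched as follows. This is a simplified illustration, not the intended circuit: the First\_thread\_of\_a\_new\_context interleaving rule and the removal of "last" threads are left out, "oldest thread" is taken to mean "no older valid thread in the buffer", and free export space is passed in as a plain dictionary. The names `ThreadStatus`, `select_alu_thread` and `free_space` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ThreadStatus:
    valid: bool = True
    needs_alu: bool = True
    texture_reads_outstanding: bool = False
    waiting_on_texture_read: bool = False
    allocation_wait: int = 0      # 00 none, 01 position, 10 param/pixel, 11 pass thru
    allocation_size: int = 0
    position_allocated: bool = False


def select_alu_thread(threads, free_space):
    """threads is ordered oldest first; free_space maps an allocation type
    to the number of free lines. Returns the winning index or None."""
    for i, t in enumerate(threads):
        if not (t.valid and t.needs_alu):
            continue
        # Considered only if either texture bit is 0.
        if t.texture_reads_outstanding and t.waiting_on_texture_read:
            continue
        if t.allocation_wait != 0:
            if free_space.get(t.allocation_wait, 0) < t.allocation_size:
                continue  # no room for the requested allocation
            older = [o for o in threads[:i] if o.valid]
            if t.allocation_wait == 0b01 and not all(o.position_allocated for o in older):
                continue  # position allocations must be done in order
            if t.allocation_wait == 0b10 and older:
                continue  # parameter/pixel allocation only for the oldest thread
        return i  # oldest thread that passed every filter
    return None
```

Because the list is scanned oldest first, returning the first survivor implements the "select the oldest" rule.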

If I now redefine 'clauses' to mean 'how many times the thread is removed from the thread buffer for the purpose of execution by either the ALU or Texture engine', then the minimum number of clauses needed is 2: one to perform the allocation for exports (execution automatically halts after an 'Alloc' instruction but does not perform the actual allocation) and one for the actual ALU/export instructions. As the 'Alloc' instruction could be part of a texture clause (presumably the final instruction in such a clause), a thread could still execute in this minimal number of 2 clauses, even if it involved texture fetching.

The Texture\_Reads\_Outstanding bit must be updated by the sequencer, based on keeping track of how many Texture Clauses have been executed by a given thread that have not yet had their data returned. Any number above 0 results in this bit being set. We could consider forcing synchronization such that two texture clauses for a given thread may not be outstanding at any time (that would be my preference for simplicity reasons and because it would require only very little change in the texture pipe interface). This would allow the sequencer to set the bit on execution of the texture clause, and allow the texture unit to return a pointer to the thread buffer on completion that clears the bit.




## 6.4 Data dependent predicate instructions

Data dependent conditionals will be supported in the R400. The only way we plan to support those is by supporting four vector/scalar predicate operations of the form:

PRED\_SETE\_# - similar to SETE except that the result is 'exported' to the sequencer.

PRED\_SETNE\_# - similar to SETNE except that the result is 'exported' to the sequencer.

PRED\_SETGT\_# - similar to SETGT except that the result is 'exported' to the sequencer

PRED\_SETGTE\_# - similar to SETGTE except that the result is 'exported' to the sequencer

For the scalar operations only we will also support the two following instructions:

PRED\_SETE0\_# - SETE0

PRED\_SETE1\_# - SETE1

The export is a single bit - 1 or 0 that is sent using the same data path as the MOVA instruction. The sequencer will maintain 4 sets of 64 bit predicate vectors (in fact 8 sets because we interleave two programs but only 4 will be exposed) and use it to control the write masking. This predicate is not maintained across clause boundaries. The # sign is used to specify which predicate set you want to use 0 thru 3.

Then we have two conditional execute bits. The first bit is a conditional execute "on" bit and the second bit tells us if we execute on 1 or 0. For example, the instruction:

P0\_ADD\_# R0,R1,R2

Is only going to write the result of the ADD into those GPRs whose predicate bit is 0. Alternatively, P1\_ADD\_# would only write the results to the GPRs whose predicate bit is set. The use of the P0 or P1 without precharging the sequencer with a PRED instruction is undefined.
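The effect of the two conditional execute bits on a write can be sketched per lane. This is a minimal illustration: registers are modelled as Python lists with one value per pixel/vertex, `execute_on` stands for the second conditional bit (0 for P0, 1 for P1), and the function name `predicated_write` is hypothetical.

```python
def predicated_write(dest, result, predicate, execute_on):
    """Return the new register contents after a predicated write.

    dest, result: lists with one value per lane.
    predicate:    integer bit vector, bit i belongs to lane i.
    execute_on:   0 for a P0_ instruction, 1 for a P1_ instruction.
    """
    return [
        new if ((predicate >> lane) & 1) == execute_on else old
        for lane, (old, new) in enumerate(zip(dest, result))
    ]
```

For a P0\_ADD, only lanes whose predicate bit is 0 receive the ADD result; the other lanes keep their previous contents.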

{Issue: do we have to have a NOP between PRED and the first instruction that uses a predicate?}

### 6.5 HW Detection of PV,PS

Because of the control program, the compiler cannot detect statically dependent instructions. In the case of non-masked writes and subsequent reads the sequencer will insert uses of PV,PS as needed. This will be done by comparing the read address and the write address of consecutive instructions. For masked writes, the sequencer will insert NOPs wherever there is a dependent read/write.

The sequencer will also have to insert NOPs between PRED\_SET and MOVA instructions and their uses.

## 6.6 Register file indexing

Because we can have loops in a fetch clause, we need to be able to index into the register file in order to retrieve the data created in a fetch clause loop and use it in an ALU clause. The instruction will include the base address for register indexing and the instruction will contain these controls:

| Bit 7 | Bit 6 |                     |
|-------|-------|---------------------|
| 0     | 0     | 'absolute register' |
| 0     | 1     | 'relative register' |
| 1     | 0     | 'previous vector'   |
| 1     | 1     | 'previous scalar'   |

In the case of an absolute register we just take the address as is. In the case of a relative register read we take the base address and we add to it the loop\_index and this becomes our new address that we give to the shader pipe.

The sequencer is going to keep a loop index computed as such:

Index = Loop\_iterator\*Loop\_step + Loop\_start.

We loop until loop\_iterator = loop\_count. Loop\_step is a signed value [-128...127]. The computed index value is a 10 bit counter that is also signed. Its real range is [-256,256]. The tenth bit is only there so that we can provide an out of range value to the "indexing logic" so that it knows when the provided index is out of range and thus can make the necessary arrangements.
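The index computation and the relative register address can be sketched as below. This is an illustrative model only: it uses the range [-256,256] quoted above to decide whether an index is in range, and the function names `loop_index` and `relative_register` are hypothetical.

```python
INDEX_MIN, INDEX_MAX = -256, 256  # "real range" quoted in the text


def loop_index(loop_iterator, loop_step, loop_start):
    """Index = Loop_iterator * Loop_step + Loop_start.

    Returns (index, in_range); in_range is False when the index falls
    outside the quoted range and the indexing logic must be told so.
    """
    if not -128 <= loop_step <= 127:
        raise ValueError("Loop_step is a signed value in [-128, 127]")
    index = loop_iterator * loop_step + loop_start
    return index, INDEX_MIN <= index <= INDEX_MAX


def relative_register(base_address, index):
    """Relative register read: base address plus the loop index."""
    return base_address + index
```

For example, the fourth iteration (iterator 3) of a loop with step 2 and start 10 gives index 16, so a relative read with base address 4 targets register 20.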

# 6.7 Debugging the Shaders

In order to be able to debug the pixel/vertex shaders efficiently, we provide 2 methods.

#### 6.7.1 Method 1: Debugging registers

Current plans are to expose 2 debugging, or error notification, registers:

- 1. address register where the first error occurred
- 2. count of the number of errors

The sequencer will detect the following groups of errors:

- count overflow
- constant indexing overflow
- register indexing overflow

Compiler recognizable errors:

- jump errors
  - relative jump address > size of the control flow program
- call stack
  - call with stack full
  - return with stack empty

A jump error will always cause the program to break. In this case, a break means that a clause will halt execution, but further clauses are allowed to be executed.

With all the other errors, the program can continue to run, potentially to worst-case limits. The program will only break if the DB\_PROB\_BREAK register is set.

If indexing outside of the constant or the register range, causing an overflow error, the hardware is specified to return the value with an index of 0. This could be exploited to generate error tokens, by reserving and initializing the 0th register (or constant) for errors.

{ISSUE : Interrupt to the driver or not?}

#### 6.7.2 Method 2: Exporting the values in the GPRs

1) The sequencer will have a debug active, count register and an address register for this mode.

Under the normal mode execution follows the normal course.

Under the debug mode it is assumed that the program is always exporting n debug vectors and that all other exports to the SX block (position, color, z, etc.) will be turned off (changed into NOPs) by the sequencer (even if they occur before the address stated by the ADDR debug register).

#### 7. Pixel Kill Mask

A vector of 64 bits is kept by the sequencer per group of pixels/vertices. Its purpose is to optimize the texture fetch requests and allow the shader pipe to kill pixels using the following instructions:

MASK\_SETE

MASK\_SETNE

MASK\_SETGT

MASK\_SETGTE




## 8. Multipass vertex shaders (HOS)

Multipass vertex shaders are able to export from the 6 last clauses but to memory ONLY.

# 9. Register file allocation

The register file allocation for vertices and pixels can either be static or dynamic. In both cases, the register file is managed using two round robins (one for pixels and one for vertices). In the dynamic case the boundary between pixels and vertices is allowed to move; in the static case it is fixed to 128-VERTEX\_REG\_SIZE for vertices and PIXEL\_REG\_SIZE for pixels.



Above is an example of how the algorithm works. Vertices come in from top to bottom; pixels come in from bottom to top. Vertices are in orange and pixels in green. The blue line is the tail of the vertices and the green line is the tail of the pixels. Thus anything between the two lines is shared. When pixels meet vertices the line turns white and the boundary is static until both vertices and pixels share the same "unallocated bubble". Then the boundary is allowed to move again. The numbering of the GPRs starts from the bottom of the picture at index 0 and goes up to the top at index 127.

## 10. Fetch Arbitration

The fetch arbitration logic chooses one of the n potentially pending fetch clauses to be executed. The choice is made by looking at the Vs and Ps reservation stations and picking the first one ready to execute. Once chosen, the clause state machine will send one 2x2 fetch per clock (or 4 fetches in one clock every 4 clocks) until all the fetch instructions of the clause are sent. This means that there cannot be any dependencies between two fetches of the same clause.

The arbitrator will not wait for the fetches to return prior to selecting another clause for execution. The fetch pipe will be able to handle up to X(?) in flight fetches and thus there can be a fair number of active clauses waiting for their fetch return data.

#### 11. ALU Arbitration

ALU arbitration proceeds in almost the same way as fetch arbitration. The ALU arbitration logic chooses one of the n potentially pending ALU clauses to be executed. The choice is made by looking at the Vs and Ps reservation stations and picking the first one ready to execute. There are two ALU arbiters, one for the even clocks and one for the odd clocks. For example, here is the sequencing of two interleaved ALU clauses (E and O stand for Even and Odd sets of 4 clocks):

Einst0 Oinst0 Einst1 Oinst1 Einst2 Oinst2 Einst0 Oinst3 Einst1 Oinst4 Einst2 Oinst0...




Proceeding this way hides the latency of 8 clocks of the ALUs. Also note that the interleaving also occurs across clause boundaries.
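The issue order in the example can be reproduced with a short sketch. It assumes, for illustration, that the even arbiter runs clauses of 3 instructions and the odd arbiter clauses of 5 instructions, and that a new clause of the same length starts as soon as the previous one ends (modelled here by wrapping around). The function name `interleave` is hypothetical.

```python
from itertools import cycle, islice


def interleave(even_len, odd_len, pairs):
    """Return the issue order of two interleaved ALU instruction streams."""
    even = cycle([f"Einst{i}" for i in range(even_len)])
    odd = cycle([f"Oinst{i}" for i in range(odd_len)])
    order = []
    # Each pair is one even slot followed by one odd slot (4 clocks each).
    for e, o in islice(zip(even, odd), pairs):
        order += [e, o]
    return order
```

Calling `interleave(3, 5, 6)` yields the twelve slots listed in the example, which shows that the two streams advance independently across clause boundaries.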

## 12. Handling Stalls

When the output file is full, the sequencer prevents the ALU arbitration logic from selecting the last clause (this way nothing can exit the shader pipe until there is room in the output file). If the packet is a vertex packet and the position buffer is full (POS\_FULL) then the sequencer also prevents a thread from entering an exporting clause. The sequencer will set the OUT\_FILE\_FULL signal n clocks before the output file is actually full, so the ALU arbiter will be able to read this signal and act accordingly by preventing exporting clauses from proceeding.

## 13. Content of the reservation station FIFOs

The reservation FIFOs contain the state of the vector of pixels and vertices. We have two sets of those: one for pixels, and one for vertices. They contain 3 bits of Render State, 7 bits for the base address of the GPRs, some bits for LOD correction, coverage mask information (in order to fetch for only valid pixels) and the quad address.

## 14. The Output File

The output file is where pixels are put before they go to the RBs. The write BW to this store is 256 bits/clock. Just before this output file are staging registers with write BW 512 bits/clock and read BW 256 bits/clock. The staging registers are 4x128 (and there are 16 of those on the whole chip).

## 15. IJ Format

The IJ information sent by the PA is of this format on a per quad basis:

We have a vector of IJ's (one IJ per pixel at the centroid of the fragment or at the center of the pixel depending on the mode bit). All pixel parameters are always interpolated at full 20x24 mantissa precision.

$$P0 = A + I(0) * (B - A) + J(0) * (C - A)$$

$$P1 = A + I(1) * (B - A) + J(1) * (C - A)$$

$$P2 = A + I(2) * (B - A) + J(2) * (C - A)$$

$$P3 = A + I(3) * (B - A) + J(3) * (C - A)$$



Multiplies (Full Precision): 8

Subtracts 19x24 (Parameters): 2

Adds: 8

FORMAT OF P's IJ:

Mantissa 20 Exp 4 for I + Sign

Mantissa 20 Exp 4 for J + Sign

Total number of bits: 20\*8 + 4\*8 + 4\*2 = 200.
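The four interpolation equations and the operation counts can be checked with a short sketch. It uses ordinary Python floats rather than the 20-bit mantissa format, so it illustrates the arithmetic only; the function name `interpolate_quad` is hypothetical.

```python
def interpolate_quad(a, b, c, ij):
    """Interpolate one parameter for the 4 pixels of a quad.

    a, b, c: parameter values at the three vertices.
    ij:      list of four (I, J) pairs, one per pixel.
    """
    db = b - a  # the 2 subtracts are shared by all four pixels
    dc = c - a
    # Per pixel: 2 multiplies and 2 adds, hence 8 multiplies and 8 adds per quad.
    return [a + i * db + j * dc for i, j in ij]


# Bit count of the IJ data for one quad, as in the total given above.
IJ_BITS_PER_QUAD = 20 * 8 + 4 * 8 + 4 * 2
```

With I = J = 0 a pixel receives exactly A, with (I, J) = (1, 0) it receives B, and with (0, 1) it receives C.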

All numbers are kept using the un-normalized floating point convention: if the exponent is different from 0 the number is normalized; if not, the number is un-normalized. The maximum range for the IJs (Full precision) is +/- 1024.

## 15.1 Interpolation of constant attributes

Because of the floating point imprecision, we need to take special provisions if all the interpolated terms are the same or if two of the terms are the same.




# 16. Staging Registers

In order for the reuse of the vertices to be 14, the sequencer will have to re-order the data sent IN ORDER by the VGT for it to be aligned with the parameter cache memory arrangement. Given the following group of vertices sent by the VGT:

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 || 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 || 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 || 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63

The sequencer will re-arrange them in this fashion:

0 1 2 3 16 17 18 19 32 33 34 35 48 49 50 51 || 4 5 6 7 20 21 22 23 36 37 38 39 52 53 54 55 || 8 9 10 11 24 25 26 27 40 41 42 43 56 57 58 59 || 12 13 14 15 28 29 30 31 44 45 46 47 60 61 62 63

The || markers show the SP divisions. In the event a shader pipe is broken, the SQ is responsible to insert padding to account for the missing pipe. For example, if SP1 is broken, vertices 4 5 6 7 20 21 22 23 36 37 38 39 52 53 54 55 will not be sent by the VGT to the SQ AND the SQ is responsible to "jump" over these vertices in order for no valid vertices to be sent to an invalid SP.
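The re-ordering above can be sketched as follows for the case where all four shader pipes are working. The function name `reorder_for_sps` is hypothetical, and the broken-pipe padding described above is deliberately not modelled.

```python
def reorder_for_sps(vertices):
    """Regroup 64 in-order vertices into four lists of 16, one per SP.

    Each SP receives 4 consecutive vertices out of every group of 16
    sent by the VGT.
    """
    assert len(vertices) == 64
    per_sp = []
    for sp in range(4):
        lane = []
        for group in range(4):
            start = 16 * group + 4 * sp
            lane.extend(vertices[start:start + 4])
        per_sp.append(lane)
    return per_sp


per_sp = reorder_for_sps(list(range(64)))
print(per_sp[0])  # [0, 1, 2, 3, 16, 17, 18, 19, 32, 33, 34, 35, 48, 49, 50, 51]
print(per_sp[1])  # [4, 5, 6, 7, 20, 21, 22, 23, 36, 37, 38, 39, 52, 53, 54, 55]
```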

The most straightforward, non-compressed interface method would be to convert, in the VGT, the data to 32-bit floating point prior to transmission to the VSISRs. In this scenario, the data would be transmitted to (and stored in) the VSISRs in full 32-bit floating point. This method requires three 24-bit fixed-to-float converters in the VGT. Unfortunately, it also requires an additional 3,072 bits of storage across the VSISRs. This interface is illustrated in Figure 11. The area of the fixed-to-float converters and the VSISRs for this method is roughly estimated as 0.759sqmm using the R300 process. The gate count estimate is shown in Figure 10.

| Basis for 8-deep Latch Memory (from R300) |            |                        |                         |
|-------------------------------------------|------------|------------------------|-------------------------|
| 8x24-bit                                  | 11631      | $\mu^2$                | 60.57813 $\mu^2$ per bit |
| Area of 96x8-deep Latch Memory            | 46524      | $\mu^2$                |                         |
| Area of 24-bit Fix-to-float Converter     | 4712       | $\mu^2$ per converter  |                         |
| Method 1                                  | Block      | Quantity               | Area                    |
|                                           | F2F        | 3                      | 14136                   |
|                                           | 8x96 Latch | 16                     | 744384                  |
|                                           |            |                        | 758520 $\mu^2$          |

Figure 10:Area Estimate for VGT to Shader Interface
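The arithmetic behind the area estimate can be re-derived from the table entries. The sketch below only reproduces the numbers; the variable names are illustrative, and the last check assumes that the compressed alternative would store three 24-bit values (72 bits) per entry, which is an inference from the 3,072-bit figure rather than something stated here.

```python
# Inputs taken from the table (areas in square microns, R300 process).
LATCH_8X24_AREA = 11631
F2F_CONVERTER_AREA = 4712

area_per_bit = LATCH_8X24_AREA / (8 * 24)     # 60.578125
latch_96x8_area = 4 * LATCH_8X24_AREA         # 96 bits = 4 x 24 bits -> 46524

f2f_total = 3 * F2F_CONVERTER_AREA            # 14136
latch_total = 16 * latch_96x8_area            # 744384
method1_total = f2f_total + latch_total       # 758520

print(method1_total)          # 758520
print(method1_total / 1e6)    # 0.75852 (square millimetres)

# Assumption: 72-bit entries in the compressed case, 96-bit entries here.
extra_bits = 16 * 8 * (96 - 72)
print(extra_bits)             # 3072
```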






Figure 11:VGT to Shader Interface

## 17. The parameter cache

The parameter cache is where the vertex shaders export their data. It consists of 16 128x128 memories (1R/1W). The reuse engine will make it so that all vertices of a given primitive hit different memories. The allocation method for these memories is a simple round robin. The parameter cache pointers are mapped in the following way: the 4 MSBs are the memory number and the 7 LSBs are the address within this memory.

| MEMORY NUMBER | ADDRESS |
|---------------|---------|
| 4 bits        | 7 bits  |

The PA generates the parameter cache addresses as the positions come from the SQ. All it needs to do is keep a Current\_Location pointer (7 bits only) and, as the positions come, increment the memory number. When the memory number field wraps around, the PA increments the Current\_Location by VS\_EXPORT\_COUNT (a snooped register from the SQ). As an example, say the memories are all empty to begin with and the vertex shader is exporting 8 parameters per vertex (VS\_EXPORT\_COUNT = 8). The first position received is going to have the PC address 00000000000, the second one 00010000000, the third one 00100000000 and so on up to 11110000000. Then the next position received (the 17<sup>th</sup>) is going to have the address 00000001000, the 18<sup>th</sup> 00010001000, the 19<sup>th</sup> 00100001000 and so on. The Current\_Location is NEVER reset EXCEPT on chip resets. The only thing to be careful about is that if the SX doesn't send you a full group of positions (<64) then you need to fill the address space so that the next group starts correctly aligned (for example, if you receive only 33 positions then you need to add 2\*VS\_EXPORT\_COUNT to Current\_Location and reset the memory count to 0 before the next vector begins).
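The address generation described above can be sketched as a small model. The class and method names are hypothetical. Two behaviours are assumptions rather than statements from this section: Current\_Location is taken to wrap modulo 128 (it is 7 bits wide), and the padding rule for a partial group is generalized from the single 33-position example, on the assumption that every group always consumes four rows of the address space.

```python
class PcAddressGenerator:
    """Behavioural sketch of the PA parameter cache address generation."""

    def __init__(self, vs_export_count):
        self.vs_export_count = vs_export_count
        self.current_location = 0   # 7-bit address within a memory
        self.memory_number = 0      # 4-bit memory select

    def next_address(self):
        """Return the 11-bit PC address for the next position received."""
        address = (self.memory_number << 7) | self.current_location
        self.memory_number = (self.memory_number + 1) % 16
        if self.memory_number == 0:
            # Memory number wrapped: advance by one vertex worth of exports.
            self.current_location = (
                self.current_location + self.vs_export_count) % 128
        return address

    def end_partial_group(self, positions_received):
        """Pad the address space after a group of fewer than 64 positions."""
        if positions_received >= 64:
            return
        rows_already_advanced = positions_received // 16
        self.current_location = (
            self.current_location
            + (4 - rows_already_advanced) * self.vs_export_count) % 128
        self.memory_number = 0


gen = PcAddressGenerator(vs_export_count=8)
addresses = [gen.next_address() for _ in range(33)]
print(format(addresses[1], "011b"))    # 00010000000
print(format(addresses[16], "011b"))   # 00000001000
gen.end_partial_group(33)
print(gen.current_location, gen.memory_number)  # 32 0
```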




### 17.1 Export restrictions

#### 17.1.1 Pixel exports:

Pixels can export 1, 2, 3 or 4 color buffers to the SX (+ Z). The exports will be done in order. The PRED\_OPTIMIZE function has to be turned off if the exports are done using interleaved predicated instructions. The exports will always be ordered to the SX.

#### 17.1.2 Vertex exports:

Position or parameter caches can be exported in any order in the shader program. It is always better to export position as soon as possible. Position has to be exported in a single export block (no texture instructions can be placed between the exports). Parameter cache exports can be done in any order with texture instructions interleaved. The PRED\_OPTIMIZE function has to be turned off if the exports are done using interleaved predicated instructions to the Parameter cache (see Arbitration restrictions for details). The exports will always be allocated in order to the SX.

#### 17.1.3 Pass thru exports:

Pass thru exports have to be done in groups of the form:

```
Alloc 4 (8 or 12)
Execute ALU(ADDR) ALU(DATA) ALU(DATA) ALU(DATA)...
```

They cannot have texture instructions interleaved in the export block. These exports are not guaranteed to be ordered.

Also, when doing a pass thru export, Position MUST be exported AFTER all pass thru exports. This position export is used to synchronize the chip when doing a transition from pass thru shader to regular shader and vice versa.

### 17.2 Arbitration restrictions

Here are the Sequencer arbitration restrictions:

- 1) Cannot execute a serialized thread if the corresponding texture pending bit is set
- 2) Cannot allocate position if any older thread has not allocated position
- 3) If last thread is marked as not valid AND marked as last and we are about to execute the second to oldest thread also marked last then:
  - a. Both threads must be from the same context (cannot allow a first thread)
  - b. Must turn off the predicate optimization for the second thread
- 4) Cannot execute a texture clause if texture reads are pending
- 5) Cannot execute last if texture pending (even if not serial)

## 18. Export Types

The export type (or the location where the data should be put) is specified using the destination address field in the ALU instruction. Here is a list of all possible export modes:

## 18.1 Vertex Shading

0:15 - 16 parameter cache

16:31 - Empty (Reserved?)

32 - Export Address

33:41 - 9 vertex exports to the frame buffer and index

42:47 - Empty

48:55 - 8 debug export (interpret as normal vertex export)

60 - export addressing mode

61 - Empty

62 - position

63 - sprite size export that goes with position export (point\_h, point\_w, edgeflag, misc). X = point size, Y = edge flag is bit 0, Z = VtxKill is bitwise OR of bits 30:0. Any bit other than sign means VtxKill.

## 18.2 Pixel Shading

```
0      - Color for buffer 0 (primary)
1      - Color for buffer 1
2      - Color for buffer 2
3      - Color for buffer 3
4:15   - Empty
16     - Buffer 0 Color/Fog (primary)
17     - Buffer 1 Color/Fog
18     - Buffer 2 Color/Fog
19     - Buffer 3 Color/Fog
20:31  - Empty
32     - Export Address
33:41  - 9 exports for multipass pixel shaders.
42:47  - Empty
48:55  - 8 debug exports (interpret as normal pixel export)
60     - export addressing mode
61     - Z for primary buffer (Z exported to 'alpha' component)
62:63  - Empty
```


## 19. Special Interpolation modes

### 19.1 Real time commands

We are unable to use the parameter memory since there is no way for a command stream to write into it. Instead we need to add three 16x128 memories (one for each of three vertices x 16 interpolants). These will be mapped onto the register bus and written by type 0 packets, and output to the parameter busses (the sequencer and/or PA need to be able to address the realtime parameter memory as well as the regular parameter store). For higher performance we should be able to view them as two banks of 16 and do double buffering, allowing one to be loaded while the other is rasterized with. Most overlay shaders will need 2 or 4 scalar coordinates, so one option might be to restrict the memory to 16x64 or 32x64, allowing only two interpolated scalars per cycle. The only problem I see with this is that, if we view support for 16 vector-4 interpolants as important (true only if we map Microsoft's high priority stream to the realtime stream), then the PA/sequencer need to support a realtime-specific mode where we need to address 32 vectors of parameters instead of 16. This mode is triggered by the primitive type: REAL TIME. The actual memories are in the SX blocks. The parameter data memories are hooked on the RBBM bus and are loaded by the CP using register mapped memory.

## 19.2 Sprites/ XY screen coordinates/ FB information

XY screen coordinates may be needed in the shader program. This functionality is controlled by the param\_gen\_I0 register (in SQ) in conjunction with the SND\_XY register (in SC) and the param\_gen\_pos register. It is also possible to send the faceness information (for OGL front/back special operations) to the shader using the same control register. Here is a list of all the modes and how they interact together:

The Data is going to be written in the register specified by the param\_gen\_pos register.

```
Param_Gen_I0 disable, snd_xy disable = No modification
Param_Gen_I0 disable, snd_xy enable = No modification
Param_Gen_I0 enable, snd_xy disable = Sign(faceness)garbage,(Sign Point)garbage,Sign(Line)s, t
Param_Gen_I0 enable, snd_xy enable = Sign(faceness)screenX,(Sign Point)screenY,Sign(Line)s, t
```

In other words, the generated vector is (X in RED, Y in GREEN, S in BLUE and T in ALPHA):

X,Y,S,T

These values are always supposed to be positive and any shader use of them should use the ABS function (as their sign bits will now be used for flags).

SignX = BackFacing

SignY = Point Primitive

SignS = Line Primitive

SignT = currently unused as a flag.

If !Point & !Line, then it is a Poly.
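A shader-side view of the flag convention above could look like the following sketch. It treats the generated vector as four ordinary floats, reads each flag from the sign bit and takes the magnitude with ABS. The function name and the dictionary keys are illustrative assumptions, not names from this specification.

```python
import math


def decode_param_gen(x, y, s, t):
    """Split the generated (X, Y, S, T) vector into flags and magnitudes."""

    def sign_bit_set(value):
        # copysign also reports the sign of -0.0, unlike "value < 0".
        return math.copysign(1.0, value) < 0.0

    point = sign_bit_set(y)
    line = sign_bit_set(s)
    return {
        "back_facing": sign_bit_set(x),   # SignX
        "point": point,                   # SignY
        "line": line,                     # SignS
        "poly": not point and not line,   # !Point & !Line
        "x": abs(x), "y": abs(y), "s": abs(s), "t": abs(t),
    }


decoded = decode_param_gen(-12.0, 34.0, 0.5, 0.25)
print(decoded["back_facing"], decoded["poly"], decoded["x"])  # True True 12.0
```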

I would assume that one implementation which allows for generic texture lookup (using 3D maps) for poly stipple and AA for the driver would be

## 19.3 Auto generated counters

When dealing with multipass shaders, the sequencer is going to generate a vector count, so that the count can be used both to write the 1st pass data to memory and to retrieve the data on the 2nd pass. The count is always generated in the same way but it is passed to the shader in a slightly different way depending on the shader type (pixel or vertex). This is toggled on and off using the GEN\_INDEX\_PIX/VTX register. The sequencer is going to keep two counters, one for pixels and one for vertices. Every time a full vector of vertices or pixels is written to the GPRs the counter is incremented. Every time a RST\_PIX\_COUNT or RST\_VTX\_COUNT event is received, the corresponding counter is reset. While there is only one count broadcast to the GPRs, the LSBs are hardwired to specific values, making the index different for all elements in the vector. Since the count must be different for all pixels/vertices and the 4 LSBs (16 positions) are hardwired to the corresponding shader unit, the SQ has two choices:

1) Maintain a 19 bit counter that counts the vectors of 64. In this case the phase must be appended to the count before the count is broadcast to the SPs:

Counter (19 bits)

Phase (2 bits)

Hardwired (4 bits)

2) Maintain a 21 bit counter that counts sub-vectors of 16. In this case only the counter is sent to the SPs:

Counter (21 bits)

Hardwired (4 bits)
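Both options produce the same 25-bit index, which the sketch below illustrates. It assumes the fields are concatenated in the order listed (counter in the most significant bits, hardwired bits in the least significant bits) and that a sub-vector count equals four times the vector count plus the phase; the function names and the `lane` argument are hypothetical.

```python
def index_from_vector_counter(vector_count, phase, lane):
    """Option 1: 19-bit vector counter, 2-bit phase, 4 hardwired bits."""
    return (((vector_count & 0x7FFFF) << 6)
            | ((phase & 0x3) << 4)
            | (lane & 0xF))


def index_from_subvector_counter(subvector_count, lane):
    """Option 2: 21-bit sub-vector counter, 4 hardwired bits."""
    return ((subvector_count & 0x1FFFFF) << 4) | (lane & 0xF)


# Vector 5, phase 3 is sub-vector 5 * 4 + 3 = 23.
print(index_from_vector_counter(5, 3, 9))        # 377
print(index_from_subvector_counter(5 * 4 + 3, 9))  # 377
```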

#### 19.3.1 Vertex shaders

In the case of vertex shaders, if GEN\_INDEX\_VTX is set, the data will be put into the x field of the third register (it means that the compiler must allocate 3 GPRs in all multipass vertex shader modes).

#### 19.3.2 Pixel shaders

In the case of pixel shaders, if GEN\_INDEX\_PIX is set, the data will be put in the x field of the param\_gen\_pos+1 register.




Figure 12: GPR input mux Control

## 20. State management

Every clock, the sequencer will report to the CP the oldest states still in the pipe. These are the states of the programs as they enter the last ALU clause.

## 20.1 Parameter cache synchronization

In order for the sequencer not to begin a group of pixels before the associated group of vertices has finished, the sequencer will keep a 6 bit count per state (for a total of 8 counters). These counters are initialized to 0 and every time a vertex shader exports its data TO THE PARAMETER CACHE, the corresponding counter is incremented. When the SC sends a new vector of pixels with the SC\_SQ\_new\_vector bit asserted, the sequencer will first check if the count is greater than 0 before accepting the transmission (it will in fact accept the transmission but then lower its ready to receive). Then the sequencer waits for the count to go to one and decrements it. The sequencer can then issue the group of pixels to the interpolators. Every time the state changes, the new state counter is initialized to 0.
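The counting scheme above can be sketched as a simple behavioural model. This is only one reading of the paragraph: a pixel vector may start once the counter for its state is non-zero, and starting it consumes one count. The class and method names are hypothetical, and the overflow behaviour of the 6 bit counter is not specified here, so the sketch raises an error instead of guessing.

```python
class PcSyncCounters:
    """Sketch of the per-state parameter cache synchronization counters."""

    MAX_COUNT = 63  # 6 bit counter

    def __init__(self, num_states=8):
        self.counts = [0] * num_states

    def state_change(self, new_state):
        """Every time the state changes, the new state counter starts at 0."""
        self.counts[new_state] = 0

    def vertex_export_to_pc(self, state):
        """A vertex shader has exported its data to the parameter cache."""
        if self.counts[state] == self.MAX_COUNT:
            raise OverflowError("6 bit counter overflow (behaviour unspecified)")
        self.counts[state] += 1

    def try_start_pixel_vector(self, state):
        """Return True and consume one count if pixels may be issued."""
        if self.counts[state] == 0:
            return False  # the sequencer would lower its ready-to-receive
        self.counts[state] -= 1
        return True


sync = PcSyncCounters()
print(sync.try_start_pixel_vector(2))  # False
sync.vertex_export_to_pc(2)
print(sync.try_start_pixel_vector(2))  # True
```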

## 21. XY Address imports

The SC will be able to send the XY addresses to the GPRs. It does so by interleaving the writes of the IJs (to the IJ buffer) with XY writes (to the XY buffer). Then, when writing the data to the GPRs, the sequencer is going to interpolate the IJ data or pass the XY data through a Fix→float converter and expander and write the converted values to the GPRs. The XYs are currently SCREEN SPACE COORDINATES. The values in the XY buffers will wrap. See section 19.2 for details on how to control the interpolation in this mode.

### 21.1 Vertex indexes imports

In order to import vertex indexes, we have 16 8x96 staging registers. These are loaded one line at a time by the VGT block (96 bits). They are loaded in floating point format and can be transferred in 4 or 8 clocks to the GPRs.

## 22. Registers

Please see the auto-generated web pages for register definitions.




## 23. Interfaces

#### 23.1 External Interfaces

Whenever an x is used, it means that the bus is broadcast to all units of the same name. For example, if a bus is named  $SQ \rightarrow SPx$  it means that SQ is going to broadcast the same information to all SP instances.

## 23.2 SC to SP Interfaces

## 23.2.1 SC\_SP#

There is one of these interfaces at the front of each SP (buffer to stage pixel interpolators). This interface transmits the I,J data for pixel interpolation. For the entire system, two quads per clock are transferred to the 4 SPs, so each of these 4 interfaces transmits one half of a quad per clock. The interface below describes a half of a quad worth of data.

The actual data which is transferred per quad is:

- Ref Pix I => S4.20 Floating Point I value \*4
- Ref Pix J => S4.20 Floating Point J value \*4

This equates to a total of 200 bits, which is transferred over 2 clocks and therefore needs an interface 100 bits wide.

Additionally, X,Y data (12-bit unsigned fixed) is conditionally sent across this data bus over the same wires in an additional clock. The X,Y data is sent on the lower 24 bits of the data bus with faceness in the msb. Transfers across these interfaces are synchronized with the SC\_SQ IJ Control Bus transfers.

The data transfer across each of these busses is controlled by an IJ\_BUF\_INUSE\_COUNT in the SC. Each time the SC has sent a pixel vector's worth of data to the SPs, it increments IJ\_BUF\_INUSE\_COUNT. Prior to sending the next pixel vector's data, it checks that the count is less than MAX\_BUFER\_MINUS\_2; if not, the SC stalls until the SQ returns a pipelined pulse to decrement the count when the SQ has scheduled a buffer free. Note: We could/may optimize for the case of sending only IJ to use all the buffers to pre-load more. Currently it is planned for the SP to hold 2 double buffers of I,J data and two buffers of X,Y data, so if either X,Y or Centers and Centroids are on, then the SC can send two buffers.
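The flow control described above is a credit-style counter, sketched below. The class and method names are hypothetical, and the value of MAX\_BUFER\_MINUS\_2 is not given in this section, so it is passed in as a parameter; the value 2 in the usage lines is purely illustrative.

```python
class IjBufferFlowControl:
    """Sketch of the SC-side IJ_BUF_INUSE_COUNT handshake."""

    def __init__(self, max_buffer_minus_2):
        self.limit = max_buffer_minus_2
        self.inuse_count = 0

    def can_send_vector(self):
        """The SC may send only while the count is below the limit."""
        return self.inuse_count < self.limit

    def vector_sent(self):
        """Called once a pixel vector's worth of data has gone to the SPs."""
        if not self.can_send_vector():
            raise RuntimeError("SC must stall until the SQ frees a buffer")
        self.inuse_count += 1

    def buffer_freed_pulse(self):
        """Pipelined pulse returned by the SQ when a buffer is freed."""
        if self.inuse_count == 0:
            raise RuntimeError("free pulse with no buffer in use")
        self.inuse_count -= 1


fc = IjBufferFlowControl(max_buffer_minus_2=2)  # illustrative limit only
fc.vector_sent()
fc.vector_sent()
print(fc.can_send_vector())  # False
fc.buffer_freed_pulse()
print(fc.can_send_vector())  # True
```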

In at least the initial version, the SC shall send 16 quads per pixel vector even if the vector is not full. This will increment buffer write address pointers correctly all the time. (We may revisit this for the SX, SP and SQ and add an EndOfVector signal on all interfaces to quit early. We opted for the simple mode first with a belief that only the end of packet and multiple new vector signals should cause a partial vector and that this would not really be a significant performance hit.)

| Name                  | Bits | Description |
|-----------------------|------|-------------|
| SC_SP#_data           | 100  | IJ information sent over 2 clocks (or X,Y in 24 LSBs with faceness in upper bit)<br>Type 0 or 1: first clock I, second clock J<br>Field: ULC, URC, LLC, LRC<br>Bits: [63:39], [38:26], [25:13], [12:0]<br>Format: SE4M20, SE4M20, SE4M20, SE4M20<br>Type 2<br>Field: Face, X, Y<br>Bits: [24], [23:12], [11:0]<br>Format: Bit, Unsigned, Unsigned |
| SC_SP#_valid          | 1    | Valid |
| SC_SP#_last_quad_data | 1    | This bit will be set on the last transfer of data per quad. |
| SC_SP#_type           | 2    | 0 -> Indicates centroids<br>1 -> Indicates centers<br>2 -> Indicates X,Y Data and faceness on data bus<br>The SC shall look at state data to determine how many types to send for the interpolation process. |

The # is included for clarity in the spec and will be replaced with a prefix of u#\_ in the verilog module statement for the SC and the SP block will have neither because the instantiation will insert the prefix.

## 23.2.2 SC\_SQ

This is the control information sent to the sequencer in order to synchronize and control the interpolation and/or loading data into the GPRs needed to execute a shader program on the sent pixels. This data will be sent over two clocks per transfer with 1 to 16 transfers. Therefore the bus (approx 94-108 bits) could be folded in half to approx 49 54 bits.

| Name        | Bits | Description |
|-------------|------|-------------|
| SC_SQ_data  | 46   | Control Data sent to the SQ<br>1 clk transfers<br>Event: valid data consist of event_id and state_id. Instruct SQ to post an event vector to send state id and event_id through request fifo and onto the reservation stations, making sure state id and/or event_id gets back to the CP. Events only follow end of packets so no pixel vectors will be in progress.<br>Empty Quad Mask: Transfer Control data consisting of pc_dealloc or new_vector. Receipt of this is to transfer pc_dealloc or new_vector without any valid quad data. New vector will always be posted to request fifo and pc_dealloc will be attached to any pixel vector outstanding or posted in request fifo if no valid quad outstanding.<br>2 clk transfers<br>Quad Data Valid: Sending quad data with or without new_vector or pc_dealloc. New vector will be posted to request fifo with or without a pixel vector and pc_dealloc will be posted with a pixel vector unless none is in progress. In this case the pc_dealloc will be posted in the request queue. Filler quads will be transferred with the quad mask set but the corresponding pixel mask set to zero. |
| SC_SQ_valid | 1    | SC sending valid data, 2<sup>nd</sup> clk could be all zeroes |

SC\_SQ\_data - first clock and second clock transfers are shown in the table below.

| Name                           | BitField | Bits | Description                                                                       |
|--------------------------------|----------|------|-----------------------------------------------------------------------------------|
| 1 <sup>st</sup> Clock Transfer |          |      |                                                                                   |
| SC_SQ_event                    | 0        | 1    | This transfer is a 1 clock event vector Force quad_mask = new_vector=pc_dealloc=0 |
| SC_SQ_event_id                 | [4:1]    | 4    | This field identifies the event 0 => denotes an End Of State Event 1              |

| MINNE DESCRIPTION OF THE PROPERTY OF THE PROPE |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
                                                                                                 |                                   |        |                                                               |                                                                                                                           | r            |  |  |  |
|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------|-----------------------------------|--------|---------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------|--------------|--|--|--|
|  |  | T | T CARE | => TBD |  |  |  |  |  |
| SC_SQ_state_id |  | [7:5] | 3 |  | State/consta (6*3+3) |  |  |  |  |
| SC_SQ_pc_dealloc |  |  | 3 |  | Deallocation token for the Parameter Cache |  |  |  |  |
| SC_SQ_new_vector |  | <u>11</u>8 | 1 |  | The SQ must wait until the Vertex shader done count is > 0; after dispatching the Pixel Vector, the SQ decrements the count. |  |  |  |  |
| SC_SQ_quad_mask
|
                                                                                                 | [12 <u>5</u> : <u>12</u><br>9]    | 4      | Quad Write ma                                                 | sk left to right SP0 => SP3                                                                                               |              |  |  |  |
| SC_SQ_end_of_prim |  | 16 | 1 | End of the primitive |  |  |  |  |  |  |
| SC_SQ_pix_mask |  | [32:17] | 16 | pixels SP0=>SP3 (UL,UR,LL,LR) |  |  |  |  |  |  |
| SC_SQ_provok_vtx |  | [34:33] | 2 | Provoking vertex for flat shading |  |  |  |  |  |  |
| SC_SQ_lod_correct_0 |  | [43:35] | 9 | LOD correction for quad 0 (SP0) (9 bits per quad) |  |  |  |  |  |  |
| SC_SQ_lod_correct_1 |  | [52:44] | 9 | LOD correction for quad 1 (SP1) (9 bits per quad) |  |  |  |  |  |  |
|  |  |  |  |  |  |  |  |  |  |  |
| 2nd Clock Transfer
|
                                                                                                 |                                   |        |                                                               |                                                                                                                           |              |  |  |  |
| SC_SQ_lod_correct_2 | | [8:0] | 9 | LOD correction for quad 2 (SP2) (9 bits per quad) | | | | | |
| SC_SQ_lod_correct_3 | | [17:9] | 9 | LOD correction for quad 3 (SP3) (9 bits per quad) | | | | | |
| SC_SQ_pc_ptr0 | | [28:18] | 11 | Parameter Cache pointer for vertex 0 | | | | | |
| SC_SQ_pc_ptr1 | | [39:29] | 11 | Parameter Cache pointer for vertex 1 | | | | | |
| SC_SQ_pc_ptr2 | | [50:40] | 11 | Parameter Cache pointer for vertex 2 | | | | | |
| SC_SQ_prim_type | | [53:51] | 3 | …d Real time command need to load tex coords from alternate buffer | | | | | |
| | | | | 000: Sprite (point) | | | | | |
| | | | | 001: Line | | | | | |
| | | | | 010: Tri_rect | | | | | |
| | | | | 100: Realtime Sprite (point) | | | | | |
| | | | | 101: Realtime Line | | | | | |

| Name               | Bits | Description                                                                   |
|--------------------|------|-------------------------------------------------------------------------------|
| SQ_SC_free_buff    | 1    | Pipelined bit that instructs SC to decrement count of buffers in use.         |
| SQ_SC_dec_cntr_cnt | 1    | Pipelined bit that instructs SC to decrement count of new vector and/or event |
|                    |      | sent to prevent SC from overflowing SQ interpolator/Reservation request fifo  |

The scan converter will submit a partial vector whenever:

- 1.) It receives a primitive marked with an end of packet signal.
- 2.) A current pixel vector is being assembled with at least one valid quad and the vector has been marked for deallocate when a primitive marked new\_vector arrives. The Scan Converter will submit a partial vector (up to 16 quads, with zero pixel mask to fill out the vector) prior to submitting the new\_vector marker/primitive.

(This prevents a hang that can be demonstrated when all primitives in a packet of three vectors are culled except for a one-quad primitive that gets marked pc\_dealloc (vertices maximum size). In this case two new\_vectors are submitted and processed, but then the one valid quad with the pc\_dealloc creates a vector, and the new vector would wait for another vertex vector to be processed. The one being waited for could never export until the pc\_dealloc signal made it through, hence the hang.)
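The two submission rules above can be sketched as a small predicate. This is only an illustrative model, not the Scan Converter's actual logic: the `PixelVector` class, its field names, and the reading of "valid quad" as "quad with a non-zero pixel mask" are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List

QUADS_PER_VECTOR = 16  # a full pixel vector holds 16 quads


@dataclass
class PixelVector:
    """Hypothetical model of a pixel vector under assembly."""
    quad_masks: List[int] = field(default_factory=list)  # 4-bit pixel mask per quad
    marked_dealloc: bool = False  # vector has been marked for deallocate

    def has_valid_quad(self) -> bool:
        # Assumption: a quad is valid when its pixel mask is non-zero.
        return any(mask != 0 for mask in self.quad_masks)


def must_submit_partial(vec: PixelVector, end_of_packet: bool, new_vector: bool) -> bool:
    """Return True when either rule from the text forces a partial submit."""
    if end_of_packet:  # rule 1
        return True
    # rule 2: new_vector arrives while a dealloc-marked vector holds a valid quad
    return new_vector and vec.marked_dealloc and vec.has_valid_quad()


def pad_vector(vec: PixelVector) -> List[int]:
    """Fill the vector out to 16 quads using zero pixel masks."""
    padding = QUADS_PER_VECTOR - len(vec.quad_masks)
    return vec.quad_masks + [0] * padding
```

In the hang scenario described above, the single quad carrying pc\_dealloc satisfies rule 2 as soon as the next new\_vector primitive arrives, so the vector is flushed instead of waiting indefinitely.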




### 23.2.3 SQ to SX(SP): Interpolator bus

| Name                       | Direction     | Bits | Description                                             |
|----------------------------|---------------|------|---------------------------------------------------------|
| SQ_SPx_interp_flat_vtx     | SQ→SPx        | 2    | Provoking vertex for flat shading                       |
| SQ_SPx_interp_flat_gouraud | SQ→SPx        | 1    | Flat or gouraud shading                                 |
| SQ_SPx_interp_cyl_wrap     | SQ→SPx        | 4    | Which channel needs to be cylindrically wrapped         |
| SQ_SPx_interp_param_gen    | SQ→SPx        | 1    | Generate Parameter                                      |
| SQ_SPx_interp_prim_type    | SQ→SPx        | 2    | Bits [1:0] of primitive type sent by SC                 |
| SQ_SPx_interp_buff_swap    | SQ→SPx        | 1    | Swap IJ buffers                                         |
| SQ_SPx_interp_IJ_line      | SQ→SPx        | 2    | IJ line number                                          |
| SQ_SPx_interp_mode         | SQ→SPx        | 1    | Center/Centroid sampling                                |
| SQ_SXx_pc_ptr0             | SQ→SXx        | 11   | Parameter Cache Pointer                                 |
| SQ_SXx_pc_ptr1             | SQ→SXx        | 11   | Parameter Cache Pointer                                 |
| SQ_SXx_pc_ptr2             | SQ→SXx        | 11   | Parameter Cache Pointer                                 |
| SQ_SXx_rt_sel              | SQ→SXx        | 1    | Selects between RT and Normal data (Bit 2 of prim type) |
| SQ_SX0_pc_wr_en            | SQ→SX0        | 8    | Write enable for the PC memories                        |
| SQ_SX1_pc_wr_en            | SQ→SX1        | 8    | Write enable for the PC memories                        |
| SQ_SXx_pc_wr_addr          | SQ→SXx        | 7    | Write address for the PCs                               |
| SQ_SXx_pc_channel_mask     | SQ→SXx        | 4    | Channel mask                                            |
| SQ_SXx_pc_ptr_valid        | SQ→SXx        | 1    | Read pointers are valid.                                |
| SQ_SPx_interp_valid        | SQ→SPx        | 1    | Interpolation control valid                             |

# 23.2.4 SQ to SP: Staging Register Data

This is a broadcast bus that sends the VSISR information to the staging registers of the shader pipes.

| Name               | Direction | Bits | Description                                            |
|--------------------|-----------|------|--------------------------------------------------------|
| SQ_SPx_vsr_data    | SQ→SPx    | 96   | Pointers of indexes or HOS surface information         |
| SQ_SPx_vsr_double  | SQ→SPx    | 1    | 0: Normal 96 bits per vert 1: double 192 bits per vert |
| SQ_SP0_vsr_valid   | SQ→SP0    | 1    | Data is valid                                          |
| SQ_SP1_vsr_valid   | SQ→SP1    | 1    | Data is valid                                          |
| SQ_SP2_vsr_valid   | SQ→SP2    | 1    | Data is valid                                          |
| SQ_SP3_vsr_valid   | SQ→SP3    | 1    | Data is valid                                          |
| SQ_SPx_vsr_read    | SQ→SPx    | 1    | Increment the read pointers                            |

#### 23.2.5 VGT to SQ: Vertex interface

### 23.2.5.1 Interface Signal Table

The area difference between the two methods is not sufficient to warrant complicating the interface or the state requirements of the VSISRs. Therefore, the POR for this interface is that the VGT will transmit the data to the VSISRs (via the Shader Sequencer) in full, 32-bit floating-point format. The VGT can transmit up to six 32-bit floating-point values to each VSISR where four or more values require two transmission clocks. The data bus is 96 bits wide.

| Name                   | Bits | Description                                                                                                                           |
|------------------------|------|---------------------------------------------------------------------------------------------------------------------------------------|
| VGT_SQ_vsisr_data      | 96   | Pointers of indexes or HOS surface information                                                                                        |
| VGT_SQ_event           | 1    | VGT is sending an event                                                                                                               |
| VGT_SQ_vsisr_continued | 1    | 0: Normal, 96 bits per vert; 1: double, 192 bits per vert                                                                             |
| VGT_SQ_end_of_vtx_vect | 1    | Indicates the last VSISR data set for the current process vector (for double vector data, "end_of_vector" is set on the first vector) |
| VGT_SQ_indx_valid      | 1    | VSISR data is valid                                                                                                                   |
| VGT_SQ_state           | 3    | Render State (6*3+3 for constants). This signal is guaranteed to be correct when "VGT_SQ_vgt_end_of_vector" is high.                  |
| VGT_SQ_send            | 1    | Data on the VGT_SQ interface is valid (see write-up for standard R400 SEND/RTR interface handshaking)                                 |
| SQ_VGT_rtr             | 1    | Ready to receive (see write-up for standard R400 SEND/RTR interface handshaking)                                                      |
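The relation between the number of 32-bit values and the number of transmission clocks follows from the 96-bit bus width. The sketch below works through that arithmetic. The function names are hypothetical, and it assumes that between one and six values are sent per VSISR and that the double (192-bit) transfer is flagged by setting VGT_SQ_vsisr_continued to 1.

```python
BUS_WIDTH_BITS = 96   # width of VGT_SQ_vsisr_data
VALUE_BITS = 32       # each value is a 32-bit float
VALUES_PER_CLOCK = BUS_WIDTH_BITS // VALUE_BITS  # three values fit in one clock
MAX_VALUES = 6        # at most six values per VSISR


def transfer_clocks(num_values: int) -> int:
    """Number of transmission clocks needed for num_values 32-bit values."""
    if not 1 <= num_values <= MAX_VALUES:
        raise ValueError("a VSISR transfer carries between 1 and 6 values")
    # Ceiling division: 1-3 values take one clock, 4-6 values take two.
    return -(-num_values // VALUES_PER_CLOCK)


def vsisr_continued(num_values: int) -> int:
    """Assumed flag value: 1 for a double (192-bit) transfer, else 0."""
    return 1 if transfer_clocks(num_values) == 2 else 0
```

For example, `transfer_clocks(3)` gives one clock and `transfer_clocks(4)` gives two, which matches the statement that four or more values require two transmission clocks.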

#### 23.2.5.2 Interface Diagrams





Figure 1. Detailed Logical Diagram for PA SQ vgt Interface.


# 23.2.6 SQ to SX: Control bus

| Name               | Direction | Bits | Description                                               |
|--------------------|-----------|------|-----------------------------------------------------------|
| SQ_SXx_exp_type    | SQ→SXx    | 2    | 00: Pixel without z (1 to 4 buffers)                      |
|                    |           |      | 01: Pixel with z (1 to 4 buffers)                         |
|                    |           |      | 10: Position (1 or 2 results)                             |
|                    |           |      | 11: Pass thru (4, 8 or 12 results aligned)                |
| SQ_SXx_exp_number  | SQ→SXx    | 2    | Number of locations needed in the export buffer           |
|                    |           |      | (encoding depends on the type, see below).                |
| SQ_SXx_exp_alu_id  | SQ→SXx    | 1    | ALU ID                                                    |
| SQ_SXx_exp_valid   | SQ→SXx    | 1    | Valid bit                                                 |
| SQ_SXx_exp_state   | SQ→SXx    | 3    | State Context                                             |
| SQ_SXx_free_done   | SQ→SXx    | 1    | Pulse that indicates that the previous export is finished |
|                    |           |      | from the point of view of the SP. This does not           |
|                    |           |      | necessarily mean that the data has been                   |
|                    |           |      | transferred to RB or PA, or that the space in export      |
|                    |           |      | buffer for that particular vector thread has been         |
|                    |           |      | freed up.                                                 |
| SQ_SXx_free_alu_id | SQ→SXx    | 1    | ALU ID                                                    |

Depending on the type, the number of export locations changes:

- Type 00: Pixels without Z
  - 00 = 1 buffer
  - 01 = 2 buffers
  - 10 = 3 buffers
  - 11 = 4 buffers
- Type 01: Pixels with Z
  - 00 = 2 buffers (color + Z)
  - 01 = 3 buffers (2 color + Z)
  - 10 = 4 buffers (3 color + Z)
  - 11 = 5 buffers (4 color + Z)
- Type 10: Position export
  - 00 = 1 position
  - 01 = 2 positions
  - 1X = Undefined
- Type 11: Pass Thru
  - 00 = 4 buffers
  - 01 = 8 buffers
  - 10 = 12 buffers
  - 11 = Undefined
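The encoding listed above can be captured as a lookup table. The following is a minimal sketch rather than a description of the SX hardware; the function name and the choice to return `None` for undefined encodings are assumptions made for illustration.

```python
from typing import Optional

# exp_type -> {exp_number -> number of export locations}
EXPORT_LOCATIONS = {
    0b00: {0b00: 1, 0b01: 2, 0b10: 3, 0b11: 4},   # pixels without Z
    0b01: {0b00: 2, 0b01: 3, 0b10: 4, 0b11: 5},   # pixels with Z (colors + Z)
    0b10: {0b00: 1, 0b01: 2},                     # position export, 1X undefined
    0b11: {0b00: 4, 0b01: 8, 0b10: 12},           # pass thru, 11 undefined
}


def export_locations(exp_type: int, exp_number: int) -> Optional[int]:
    """Decode SQ_SXx_exp_type / SQ_SXx_exp_number into a location count.

    Returns None for the encodings the list marks as undefined.
    """
    if not 0 <= exp_type <= 3 or not 0 <= exp_number <= 3:
        raise ValueError("both fields are 2 bits wide")
    return EXPORT_LOCATIONS[exp_type].get(exp_number)
```

For instance, `export_locations(0b01, 0b11)` yields 5 (four color buffers plus Z), while `export_locations(0b10, 0b10)` yields `None`.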

The signals below the thick black line in the table above (SQ_SXx_free_done and SQ_SXx_free_alu_id) form the end of transfer packet that tells the SX that a given export is finished. This packet will always arrive either before or at the same time as the next export to the same ALU ID.

#### 23.2.7 SX to SQ: Output file control

| Name                 | Direction | Bits | Description |
|----------------------|-----------|------|-------------|
| SXx_SQ_exp_count_rdy | SXx→SQ    | 1    | Raised by SX0 to indicate that the following two fields reflect the result of the most recent export |
| SXx_SQ_exp_pos_avail | SXx→SQ    | 2    | Specifies whether there is room for another position.<br>00: 0 buffers ready<br>01: 1 buffer ready<br>10: 2 or more buffers ready |
| SXx_SQ_exp_buf_avail | SXx→SQ    | 7    | Specifies the space available in the output buffers.<br>0: buffers are full<br>1: 2K-bits available (32-bits for each of the 64 pixels in a clause)<br>64: 128K-bits available (16 128-bit entries for each of 64 pixels)<br>…7: RESERVED |
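The two documented end points of SXx_SQ_exp_buf_avail are consistent with a unit of 2K bits per count (1 gives 2K bits, 64 gives 128K bits). The sketch below decodes the field on that basis. It assumes that intermediate values scale linearly and that every value above 64 is reserved; neither point is stated explicitly in the table, and the function names are hypothetical.

```python
from typing import Optional

BITS_PER_UNIT = 32 * 64  # 32 bits for each of 64 pixels = 2K bits
MAX_UNITS = 64           # 64 units = 128K bits (16 x 128-bit entries x 64 pixels)


def exp_buf_avail_bits(code: int) -> Optional[int]:
    """Decode the 7-bit SXx_SQ_exp_buf_avail field into available bits.

    Assumption: linear scaling between the documented values 1 and 64,
    with codes above 64 treated as reserved (returned as None).
    """
    if not 0 <= code <= 0x7F:
        raise ValueError("field is 7 bits wide")
    if code > MAX_UNITS:
        return None
    return code * BITS_PER_UNIT


def exp_pos_avail_buffers(code: int) -> Optional[int]:
    """Decode SXx_SQ_exp_pos_avail; 2 means 'two or more' buffers ready."""
    if not 0 <= code <= 3:
        raise ValueError("field is 2 bits wide")
    return {0b00: 0, 0b01: 1, 0b10: 2}.get(code)  # 0b11 is not listed
```

The check `64 * 2048 == 16 * 128 * 64` confirms that the two descriptions of the maximum (128K bits, and 16 entries of 128 bits for each of 64 pixels) agree.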

# 23.2.8 SQ to TP: Control bus

Once every clock, the fetch unit tells the sequencer which RS line it is currently working on and whether the data in the GPRs is ready. This lets the sequencer update the fetch valid flags for the reservation station. The sequencer also provides the instruction and constants for the fetch to execute, and the address in the register file at which to write the fetch return data.

| Name                   | Direction | Bits | Description                                               |
|------------------------|-----------|------|-----------------------------------------------------------|
| TPx_SQ_data_rdy        | TPx→SQ    | 1    | Data ready                                                |
| TPx_SQ_rs_line_num     | TPx→SQ    | 6    | Line number in the Reservation station                    |
| TPx_SQ_type            | TPx→SQ    | 1    | Type of data sent (0:PIXEL, 1:VERTEX)                     |
| SQ_TPx_send            | SQ→TPx    | 1    | Sending valid data                                        |
| SQ_TPx_const           | SQ→TPx    | 48   | Fetch state sent over 4 clocks (192 bits total)           |
| SQ_TPx_instr           | SQ→TPx    | 24   | Fetch instruction sent over 4 clocks                      |
| SQ_TPx_end_of_group    | SQ→TPx    | 1    | Last instruction of the group                             |
| SQ_TPx_Type            | SQ→TPx    | 1    | Type of data sent (0:PIXEL, 1:VERTEX)                     |
| SQ_TPx_gpr_phase       | SQ→TPx    | 2    | Write phase signal                                        |
| SQ_TP0_lod_correct     | SQ→TP0    | 6    | LOD correct 3 bits per comp 2 components per quad         |
| SQ_TP0_pix_mask        | SQ→TP0    | 4    | Pixel mask 1 bit per pixel                                |
| SQ_TP1_lod_correct     | SQ→TP1    | 6    | LOD correct 3 bits per comp 2 components per quad         |
| SQ_TP1_pix_mask        | SQ→TP1    | 4    | Pixel mask 1 bit per pixel                                |
| SQ_TP2_lod_correct     | SQ→TP2    | 6    | LOD correct 3 bits per comp 2 components per quad         |
| SQ_TP2_pix_mask        | SQ→TP2    | 4    | Pixel mask 1 bit per pixel                                |
| SQ_TP3_lod_correct     | SQ→TP3    | 6    | LOD correct 3 bits per comp 2 components per quad         |
| SQ_TP3_pix_mask        | SQ→TP3    | 4    | Pixel mask 1 bit per pixel                                |
| SQ_TPx_rs_line_num     | SQ→TPx    | 6    | Line number in the Reservation station                    |
| SQ_TPx_write_gpr_index | SQ→TPx    | 7    | Index into Register file for write of returned Fetch Data |
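SQ_TPx_const and SQ_TPx_instr are both narrower than the data they carry, so each transfer is spread over four clocks (48 bits x 4 = 192 bits of fetch state, 24 bits x 4 = 96 bits of instruction). The sketch below shows one way to model that slicing. The helper names are hypothetical, and sending the least significant slice first is an assumption, since the table does not state the order.

```python
from typing import List


def split_over_clocks(value: int, total_bits: int, clocks: int = 4) -> List[int]:
    """Slice value into equal-width beats, least significant beat first (assumed)."""
    if total_bits % clocks != 0:
        raise ValueError("total_bits must divide evenly across the clocks")
    if value < 0 or value >> total_bits:
        raise ValueError("value does not fit in total_bits")
    width = total_bits // clocks
    mask = (1 << width) - 1
    return [(value >> (i * width)) & mask for i in range(clocks)]


def join_beats(beats: List[int], width: int) -> int:
    """Inverse of split_over_clocks: reassemble the beats into one value."""
    value = 0
    for i, beat in enumerate(beats):
        value |= beat << (i * width)
    return value


# A 192-bit fetch constant becomes four 48-bit beats on SQ_TPx_const,
# and a 96-bit fetch instruction becomes four 24-bit beats on SQ_TPx_instr.
const_beats = split_over_clocks((1 << 192) - 1, 192)
instr_beats = split_over_clocks((3 << 72) | (2 << 48) | (1 << 24), 96)
```

Here `instr_beats` holds the slices in the order they would be sent, and `join_beats(instr_beats, 24)` recovers the original 96-bit value.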

# 23.2.9 TP to SQ: Texture stall

The TP sends this signal to the SQ and the SPs when its input buffer is full.






Description: Do not send more texture requests if asserted.

# 23.2.10 SQ to SP: Texture stall

| Name               | Direction | Bits | Description                                   |
|--------------------|-----------|------|-----------------------------------------------|
| SQ_SPx_fetch_stall | SQ→SPx    | 1    | Do not send more texture requests if asserted |

# 23.2.11 SQ to SP: GPR and auto counter

| Name                 | Direction | Bits  | Description                                           |
|----------------------|-----------|-------|-------------------------------------------------------|
| SQ_SPx_gpr_wr_addr   | SQ→SPx    | 7     | Write address                                         |
| SQ_SPx_gpr_rd_addr   | SQ→SPx    | 7     | Read address                                          |
| SQ_SPx_gpr_rd_en     | SQ→SPx    | 1     | Read Enable                                           |
| SQ_SP0_gpr_wr_en     | SQ→SPx    | 14    | Write Enable for the GPRs of SP0                      |
| SQ_SP1_gpr_wr_en     | SQ→SPx    | 14    | Write Enable for the GPRs of SP1                      |
| SQ_SP2_gpr_wr_en     | SQ→SPx    | 14    | Write Enable for the GPRs of SP2                      |
| SQ_SP3_gpr_wr_en     | SQ→SPx    | 14    | Write Enable for the GPRs of SP3                      |
| SQ_SPx_gpr_phase     | SQ→SPx    | 2     | The phase mux (arbitrates between inputs, ALU SRC     |
|                      |           |       | reads and writes)                                     |
| SQ_SPx_channel_mask  | SQ→SPx    | 4     | The channel mask                                      |
| SQ_SPx_gpr_input_sel | SQ→SPx    | 2     | When the phase mux selects the inputs, this selects   |
|                      |           |       | the source to read from: Interpolated data, VTX0,     |
|                      |           |       | VTX1, autogen counter.                                |
| SQ_SPx_auto_count    | SQ→SPx    | 12721 | Auto count generated by the SQ, common for all shader |
|                      |           |       | pipes                                                 |




# 23.2.12 SQ to SPx: Instructions

| Name               | Direction | Bits | Description                                        |
|--------------------|-----------|------|----------------------------------------------------|
| SQ_SPx_instr_start | SQ→SPx    | 1    | Instruction start                                  |
| SQ_SP_instr        | SQ→SPx    | 24   | Transferred over 4 cycles                          |
|                    |           |      | 0: SRC A Negate Argument Modifier 0:0              |
|                    |           |      | SRC A Abs Argument Modifier 1:1                    |
|                    |           |      | SRC A Swizzle 9:2                                  |
|                    |           |      | Vector Dst 15:10                                   |
|                    |           |      | Per channel Select 23:16                           |
|                    |           |      | 00: GPR                                            |
|                    |           |      | 01: PV                                             |
|                    |           |      | 10: PS                                             |
|                    |           |      | 11: Constant (if 11 has to be 11 for all channels) |
|                    |           |      |                                                    |
|                    |           |      | 1: SRC B Negate Argument Modifier 0:0              |
|                    |           |      | SRC B Abs Argument Modifier 1:1                    |
|                    |           |      | SRC B Swizzle 9:2                                  |
|                    |           |      | Scalar Dst 15:10                                   |
|                    |           |      | Per channel Select 23:16                           |
|                    |           |      | 00: GPR                                            |
|                    |           |      | 01: PV                                             |
|                    |           |      | 10: PS                                             |
|                    |           |      | 11: Constant (if 11 has to be 11 for all channels) |
|                    |           |      |                                                    |
|                    |           |      | 2: SRC C Negate Argument Modifier 0:0              |
|                    |           |      | SRC C Abs Argument Modifier 1:1                    |
|                    |           |      | SRC C Swizzle 9:2                                  |
|                    |           |      | Unused 15:10                                       |
|                    |           |      | Per channel Select 23:16                           |
|                    |           |      | 00: GPR                                            |
|                    |           |      | 01: PV                                             |
|                    |           |      | 10: PS                                             |
|                    |           |      | 11: Constant (if 11 has to be 11 for all channels) |
|                    |           |      |                                                    |
|                    |           |      | 3: Vector Opcode 4:0                               |
|                    |           |      | Scalar Opcode 10:5                                 |
|                    |           |      | Vector Clamp 11:11                                 |
|                    |           |      | Scalar Clamp 12:12                                 |
|                    |           |      | Vector Write Mask 16:13                            |
|                    |           |      | Scalar Write Mask 20:17                            |
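The bit ranges in the table can be exercised with a small packing helper. This is an illustrative sketch only: the function and parameter names are hypothetical, and it covers just the fields listed in this table (Negate 0:0, Abs 1:1, Swizzle 9:2, Dst 15:10 and Per channel Select 23:16 for the source words; the opcode, clamp and write mask fields for word 3).

```python
from typing import List, Tuple


def pack_fields(fields: List[Tuple[int, int, int]]) -> int:
    """Pack (value, msb, lsb) triples into one instruction word."""
    word = 0
    for value, msb, lsb in fields:
        width = msb - lsb + 1
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in bits {msb}:{lsb}")
        word |= value << lsb
    return word


def pack_src_word(negate: int, abs_mod: int, swizzle: int, dst: int, select: int) -> int:
    """Words 0 to 2: one source operand per cycle (dst is unused for SRC C)."""
    return pack_fields([
        (negate, 0, 0),
        (abs_mod, 1, 1),
        (swizzle, 9, 2),
        (dst, 15, 10),
        (select, 23, 16),   # 2 bits per channel: 00 GPR, 01 PV, 10 PS, 11 Constant
    ])


def pack_opcode_word(vec_op: int, scal_op: int, vec_clamp: int,
                     scal_clamp: int, vec_mask: int, scal_mask: int) -> int:
    """Word 3: opcodes, clamps and write masks as listed in the table."""
    return pack_fields([
        (vec_op, 4, 0),
        (scal_op, 10, 5),
        (vec_clamp, 11, 11),
        (scal_clamp, 12, 12),
        (vec_mask, 16, 13),
        (scal_mask, 20, 17),
    ])
```

Each source word fills all 24 bits (23:0), which is consistent with a 24-bit bus transferring the instruction over four cycles.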

|                                                                                                                 | NATE DATE                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | EDIT D                       | ATE                                      | R400 Sequencer Specification                                                                                                                                                                                                                                                                                                                 | PAGE                                        |                                            |
|---|---|---|---|---|---|---|
| SQ_SP0_pred_override | SQ→SP0 | 4 | 0: Use per channel RGBA field (enables the logic; if not set, only pay attention to the per channel setting). 1: Use GPR | | | |
| SQ_SP1_pred_override | SQ→SP1 | 4 | 0: Use per channel RGBA field (enables the logic; if not set, only pay attention to the per channel setting). 1: Use GPR | | | |
| SQ_SP2_pred_override | SQ→SP2 | 4 | 0: Use per channel RGBA field (enables the logic; if not set, only pay attention to the per channel setting). 1: Use GPR | | | |
| SQ_SP3_pred_override | SQ→SP3 | 4 | 0: Use per channel RGBA field (enables the logic; if not set, only pay attention to the per channel setting). 1: Use GPR | | | |
| SQ_SPx_exp_alu_id | SQ→SPx | 1 | ALU ID |
| SQ_SPx_exporting | SQ→SPx | 1 | 0: Not Exporting 1: Exporting |
| SQ_SPx_stall | SQ→SPx | 1 | Stall signal |

### 2.13 SQ to SX: write mask interface (must be aligned with the SP data)

| <del>Name</del><br>SQ SX0 write mas                                                                             |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
| Name | Direction | Bits | Description |
|------|-----------|------|-------------|
| SQ_SX0_write_mask | SQ→SP0 | 8 | Result of pixel kill in the shader pipe, which must be output for all pixel exports (depth and all color buffers). 4x4 because 16 pixels are computed per clock. This is for the data coming out of SP0 |
| SQ_SX1_write_mask | SQ→SP1 | 8 | Result of pixel kill in the shader pipe, which must be output for all pixel exports (depth and all color buffers). 4x4 because 16 pixels are computed per clock. This is for the data coming out of SP1 |

Constant address load / Predicate Set / Kill set

| Name | Direction | Bits | Description |
|------|-----------|------|-------------|
| SP0_SQ_const_addr | | | Constant address load / predicate vector load (4 bits only) / Kill vector load (4 bits only) to the sequencer |
| SP0_SQ_valid | SP0→
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |                              |                                          | Data valid                                                                                                                                                                                                                                                                                                                                   |                                             |                                            |
|                                                                                                                 |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |                              |                                          | Constant address load / predicate vector load                                                                                                                                                                                                                                                                                                | d (4 bits only)/                            |                                            |
| SF1_SQ_const_addi                                                                                               | SP1→                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | 5Q                           |                                          | Kill vector load (4 bits only) to the sequencer                                                                                                                                                                                                                                                                                              |                                             |                                            |
| SP1_SQ_valid                                                                                                    | SP1→                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | SQ                           | 1                                        | Data valid                                                                                                                                                                                                                                                                                                                                   |                                             |                                            |
| SP1_SQ_valid<br>SP2_SQ_const_addr                                                                               | SP1→S                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | SQ<br>SQ                     | 1 36                                     | Data valid<br>Constant address load / predicate vector loac<br>Kill vector load (4 bits only) to the sequencer                                                                                                                                                                                                                               | d (4 bits only)/                            |                                            |
| SP1_SQ_valid<br>SP2_SQ_const_addr<br>SP2_SQ_valid                                                               | SP1→:<br>SP2→:<br>SP2→:                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         | SQ<br>SQ                     | 1<br>36                                  | Data valid<br>Constant address load / predicate vector load<br>Kill vector load (4 bits only) to the sequencer<br>Data valid                                                                                                                                                                                                                 |                                             |                                            |

| Name | Direction | Bits | Description |
|------|-----------|------|-------------|
| SP1_SQ_valid | SP1→SQ | 1 | Data valid |
| SP2_SQ_const_addr | SP2→SQ | 36 | Constant address load / predicate vector load / Kill vector load (4 bits only) to the sequencer |
| SP2_SQ_valid | SP2→SQ | 1 | Data valid |
| SP3_SQ_const_addr | SP3→SQ | 36 | Constant address load / predicate vector load / Kill vector load (4 bits only) to the sequencer |
| SP3_SQ_valid | SP3→SQ | 1 | Data valid |
| SP0_SQ_data_type | SP0→SQ | 2 | Data Type<br>0: Constant Load<br>1: Predicate Set<br>2: Kill vector load |

Because of the sharing [...] one of the MOVA, PREDSET or KILL instructions may be coissued




#### 23.2.15 SP0 to SQ: Kill vector load

#### 23.2.16 SQ to CP: RBBM bus

| Name           | Direction | Bits | Description          |
|----------------|-----------|------|----------------------|
| SQ_RBB_rs      | SQ→CP     | 1    | Read Strobe          |
| SQ_RBB_rd      | SQ→CP     | 32   | Read Data            |
| SQ_RBBM_nrtrtr | SQ→CP     | 1    | Optional             |
| SQ_RBBM_rtr    | SQ→CP     | 1    | Real-Time (Optional) |

#### 23.2.17 CP to SQ: RBBM bus

| Name               | Direction | Bits | Description                          |
|--------------------|-----------|------|--------------------------------------|
| rbbm_we            | CP→SQ     | 1    | Write Enable                         |
| rbbm_a             | CP→SQ     | 15   | Address. Upper extent is TBD (16:2)  |
| rbbm_wd            | CP→SQ     | 32   | Data                                 |
| rbbm_be            | CP→SQ     | 4    | Byte Enables                         |
| rbbm_re            | CP→SQ     | 1    | Read Enable                          |
| rbb_rs0            | CP→SQ     | 1    | Read Return Strobe 0                 |
| rbb_rs1            | CP→SQ     | 1    | Read Return Strobe 1                 |
| rbb_rd0            | CP→SQ     | 32   | Read Data 0                          |
| rbb_rd1            | CP→SQ     | 32   | Read Data 1                          |
| RBBM_SQ_soft_reset | CP→SQ     | 1    | Soft Reset                           |

#### 23.2.18 SQ to CP: State report

| Name             | Direction | Bits | Description            |
|------------------|-----------|------|------------------------|
| SQ_CP_vs_event   | SQ→CP     | 1    | Vertex Shader Event    |
| SQ_CP_vs_eventid | SQ→CP     | 42   | Vertex Shader Event ID |
| SQ_CP_ps_event   | SQ→CP     | 1    | Pixel Shader Event     |
| SQ_CP_ps_eventid | SQ→CP     | 42   | Pixel Shader Event ID  |

eventid = 0 => \*sEndOfState (e.g. VsEndOfState)

eventid = 1 => \*sDone (e.g. VsDone)

So the CP will assume the VS is done with a state whenever it receives a pulse on SQ\_CP\_vs\_event while SQ\_CP\_vs\_eventid = 0.
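The event encoding above can be illustrated with a small sketch. It is not part of the specification: the function name `decode_event` and the string-based shader selector are hypothetical, and only the two event IDs listed above are modelled.

```python
from enum import IntEnum


class EventId(IntEnum):
    """Event IDs as listed in the state report section."""
    END_OF_STATE = 0  # *sEndOfState
    DONE = 1          # *sDone


def decode_event(shader, event_pulse, eventid):
    """Return the event name the CP would observe, or None without a pulse.

    `shader` is "vs" or "ps" (hypothetical selector for the
    SQ_CP_vs_* / SQ_CP_ps_* signal pair).
    """
    if not event_pulse:
        return None  # the event ID is only meaningful while the event line pulses
    prefix = {"vs": "Vs", "ps": "Ps"}[shader]
    suffix = {EventId.END_OF_STATE: "EndOfState", EventId.DONE: "Done"}[EventId(eventid)]
    return prefix + suffix
```

For example, `decode_event("vs", True, 0)` yields `"VsEndOfState"`, which is the case in which the CP treats the vertex shader as done with a state.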

# 23.3 Example of control flow program execution

We now provide some examples of execution to better illustrate the new design.

Given the program:

Alu 0

Alu 1

Tex 0

Tex 1

Alu 3 Serial

Alu 4

Tex 2

Alu 5

Alu 6 Serial

Tex 3

Alu 7

Alloc Position 1 buffer

Alu 8 Export

Tex 4

Alloc Parameter 3 buffers

Alu 9 Export 0

Tex 5

Alu 10 Serial Export 2

Alu 11 Export 1 End

This program would be converted into the following CF instructions:

Execute 0 Alu 0 Alu 0 Tex 0 Tex 1 Alu 0 Alu 0 Tex 0 Alu 1 Alu 0 Tex

Execute 0 Alu

Alloc Position 1

Execute 0 Alu 0 Tex

Alloc Param 3

Execute end 0 Alu 0 Tex 1 Alu 0 Alu
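To make the textual CF notation easier to follow, here is a minimal parsing sketch. It assumes that each `Execute` clause is a sequence of (bit, type) pairs and that the leading bit is the serial flag. That reading is an inference from the example, where the 1 bits line up with the instructions marked Serial in the program. The function name `parse_cf` and the dictionary layout are hypothetical.

```python
def parse_cf(line):
    """Parse one CF instruction written in the notation of this example.

    Returns a dict; for Execute, 'instrs' is a list of (serial_bit, type)
    pairs, where the serial-bit reading is an assumption.
    """
    tokens = line.split()
    op = tokens[0]
    if op == "Alloc":
        # e.g. "Alloc Position 1" -> export target and buffer count
        return {"op": "Alloc", "target": tokens[1], "size": int(tokens[2])}
    if op != "Execute":
        raise ValueError("unknown CF instruction: " + line)
    rest = tokens[1:]
    end = bool(rest) and rest[0].lower() == "end"  # "Execute end ..." closes the program
    if end:
        rest = rest[1:]
    if len(rest) % 2 != 0:
        raise ValueError("expected (bit, type) pairs: " + line)
    instrs = [(int(rest[i]), rest[i + 1].upper()) for i in range(0, len(rest), 2)]
    return {"op": "Execute", "end": end, "instrs": instrs}
```

Applied to the first CF instruction above, this gives ten pairs, with the fifth and ninth carrying a serial bit of 1 (Alu 3 and Alu 6 in the source program).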

And the execution of this program would look like this:

Put thread in Vertex RS:

Control Flow Instruction Pointer (12 bits), (CFP)

Execution Count Marker (3 or 4 bits), (ECM)

Loop Iterators (4x9 bits), (LI)

Call return pointers (4x12 bits), (CRP)

Predicate Bits (4x64 bits), (PB)

Export ID (1 bit), (EXID)

GPR Base Ptr (8 bits), (GPR)

Export Base Ptr (7 bits), (EB)

Context Ptr (3 bits), (CPTR)

LOD correction bits (16x6 bits), (LOD)

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 0   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |
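As a quick check of the per-thread storage implied by the field widths listed above, the following sketch sums them. The field names and widths come from the list; the function name `total_state_bits` is hypothetical, and the ECM width is left as a parameter because the text gives it as 3 or 4 bits.

```python
# (count, width-in-bits) for each state field; the ECM width is supplied separately.
STATE_FIELDS = {
    "CFP": (1, 12),   # Control Flow Instruction Pointer
    "LI": (4, 9),     # Loop Iterators
    "CRP": (4, 12),   # Call return pointers
    "PB": (4, 64),    # Predicate Bits
    "EXID": (1, 1),   # Export ID
    "GPR": (1, 8),    # GPR Base Ptr
    "EB": (1, 7),     # Export Base Ptr
    "CPTR": (1, 3),   # Context Ptr
    "LOD": (16, 6),   # LOD correction bits
}


def total_state_bits(ecm_width):
    """Sum of all state-bit fields for one thread, given the ECM width."""
    if ecm_width not in (3, 4):
        raise ValueError("the text lists the ECM as 3 or 4 bits")
    return ecm_width + sum(count * width for count, width in STATE_FIELDS.values())
```

Under these widths a thread carries 470 state bits with a 3-bit ECM and 471 with a 4-bit ECM, most of it in the predicate bits (256) and LOD correction bits (96).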

Valid Thread (VALID)

Texture/ALU engine needed (TYPE)

Texture Reads are outstanding (PENDING)

Waiting on Texture Read to Complete (SERIAL)

Allocation Wait (2 bits) (ALLOC)

00 - No allocation needed

01 - Position export allocation needed (ordered export)

10 - Parameter or pixel export needed (ordered export)

11 - Pass thru (out of order export)

Allocation Size (4 bits) (SIZE)

Position Allocated (POS_ALLOC)

First thread of a new context (FIRST)

Last (1 bit), (LAST)

| Status Bits |      |         |        |       |      |           |       |      |  |
|-------------|------|---------|--------|-------|------|-----------|-------|------|--|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |  |
| 1           | ALU  | 0       | 0      | 0     | 0    | 0         | 1     | 0    |  |
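The status bits and the 2-bit ALLOC encoding can be modelled as follows. This is only an illustrative sketch: the class and function names are hypothetical, the TYPE field is represented as a string for readability, and the defaults mirror the status row shown above for a thread that has just been placed in the vertex RS.

```python
from dataclasses import dataclass
from enum import IntEnum


class Alloc(IntEnum):
    """Allocation Wait encoding (2 bits) as listed above."""
    NONE = 0b00            # no allocation needed
    POSITION = 0b01        # position export allocation needed (ordered)
    PARAM_OR_PIXEL = 0b10  # parameter or pixel export needed (ordered)
    PASS_THRU = 0b11       # pass thru (out of order export)


@dataclass
class Status:
    """Status bits of a thread; defaults match the initial row above."""
    valid: bool = True
    type: str = "ALU"      # engine needed: "ALU" or "TEX"
    pending: bool = False  # texture reads outstanding
    serial: bool = False   # waiting on a texture read to complete
    alloc: Alloc = Alloc.NONE
    size: int = 0          # allocation size (4 bits)
    pos_alloc: bool = False
    first: bool = True     # first thread of a new context
    last: bool = False


def is_ordered_export(alloc):
    """True for the two encodings the list marks as ordered exports."""
    return alloc in (Alloc.POSITION, Alloc.PARAM_OR_PIXEL)
```

A freshly inserted thread is then simply `Status()`, and `is_ordered_export(Alloc(0b11))` is false because pass-thru exports are out of order.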

Then the thread is picked up for the execution of the first control flow instruction:

Execute 0 Alu 0 Alu 0 Tex 0 Tex 1 Alu 0 Alu 0 Tex 0 Alu 1 Alu 0 Tex

It executes the first two ALU instructions and goes back to the RS for a resource request change. Here is the state returned to the RS:


| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 2   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |  |
|-------------|------|---------|--------|-------|------|-----------|-------|------|--|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |  |
| 1           | TEX  | 0       | 0      | 0     | 0    | 0         | 1     | 0    |  |
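The step just described (run the leading ALU instructions, then return to the RS asking for the texture pipe) can be sketched as below. The sketch assumes the ECM counts instructions already executed within the current CF instruction, which is consistent with ECM = 2 in the table above but is not stated explicitly here. The function name is hypothetical, and the serial and pending bits are deliberately ignored.

```python
def run_until_resource_change(instrs, ecm):
    """Advance over consecutive instructions that use the same engine.

    `instrs` is a list of (serial_bit, type) pairs for one Execute clause and
    `ecm` the index of the next instruction to run (assumed ECM meaning).
    Returns (new_ecm, next_type), with next_type None at the end of the clause.
    """
    current = instrs[ecm][1]
    while ecm < len(instrs) and instrs[ecm][1] == current:
        ecm += 1  # one more instruction executed on the current engine
    next_type = instrs[ecm][1] if ecm < len(instrs) else None
    return ecm, next_type


# First Execute clause of the example, as (serial_bit, type) pairs.
FIRST_CLAUSE = [
    (0, "ALU"), (0, "ALU"), (0, "TEX"), (0, "TEX"), (1, "ALU"),
    (0, "ALU"), (0, "TEX"), (0, "ALU"), (1, "ALU"), (0, "TEX"),
]
```

Calling `run_until_resource_change(FIRST_CLAUSE, 0)` returns `(2, "TEX")`, matching the ECM and TYPE values in the state returned to the RS.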

Then when the texture pipe frees up, the arbiter picks up the thread to issue the texture reads. The thread comes back in this state:


| L | State Bits                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                     |     |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
                                                                                                            |     |                                                          |      |                         |    |      |     |  |
|---|---|---|---|---|---|---|---|---|---|---|---|
|   | CFP | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |   |
|   | 0 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |   |

Status Bits

| VALID | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
|-------|------|---------|--------|-------|------|-----------|-------|------|
| 1     | ALU  | 1       | 1      | 0     | 0    | 0         | 1     | 0    |

Because of the SERIAL bit, the arbiter must wait for the texture to return and clear the PENDING bit before it can pick the thread up. Let's say that the texture reads are complete; the arbiter then picks up the thread and returns it in this state:

State Bits

| CFP | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
|-----|-----|----|-----|----|------|-----|----|------|-----|
| 0   | 6   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |
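
The serial-bit rule described above can be illustrated with a minimal sketch. It is not taken from the specification: the `ThreadStatus` class, the `arbiter_can_pick` function, and the requirement that VALID be set are assumptions made for illustration. The sketch models only the one rule stated in the text, namely that a thread with SERIAL set cannot be picked up while PENDING is still set. How the arbiter treats a non-serial thread with PENDING set is not covered here, and the sketch simply lets such a thread through.

```python
from dataclasses import dataclass


@dataclass
class ThreadStatus:
    """Status bits of one thread, named after the Status Bits table columns."""
    valid: int
    thread_type: str   # "ALU" in the example; other types are not modeled
    pending: int
    serial: int
    alloc: int = 0
    size: int = 0
    pos_alloc: int = 0
    first: int = 0
    last: int = 0


def arbiter_can_pick(status: ThreadStatus) -> bool:
    """Assumed gate: a valid thread is blocked while SERIAL and PENDING are both set."""
    if not status.valid:
        return False
    return not (status.serial and status.pending)


# Earlier status bits from the example: PENDING = 1 and SERIAL = 1.
thread = ThreadStatus(valid=1, thread_type="ALU", pending=1, serial=1, first=1)
print(arbiter_can_pick(thread))   # False: texture data is still outstanding

# The texture return clears PENDING, so the thread becomes eligible.
thread.pending = 0
print(arbiter_can_pick(thread))   # True
```

Keeping the gate as a pure function of the status bits mirrors the way the example walks through the thread purely in terms of its state and status tables.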

Status Bits

|   |   |   |   |   |   |   |   |   |   |
|-------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------|---------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------------------------|-------|------|-----------|-------|------|--|
|             | CONTRACTOR OF THE PARTY OF THE |         | DESCRIPTION OF THE PROPERTY OF |       |      |           |       |      |  |
| VALID       | TYPE                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
                                                              | PENDING | SERIAL                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                         | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |  |
| 1           | TEX                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
                                                              | 0       | 0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
                                                                                                                         | 0     | 0    | 0         | 1     | 0    |  |

Again, once the TP frees up, the arbiter picks up the thread and executes it. The thread returns in this state:

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 7   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |  |
|-------------|------|---------|--------|-------|------|-----------|-------|------|--|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |  |
| 1           | ALU  | 1       | 0      | 0     | 0    | 0         | 1     | 0    |  |

Now, even if the texture has not returned, we can still pick up the thread for ALU execution because the serial bit is not set. The thread will, however, come back to the RS for the second ALU instruction because that instruction has the serial bit set:

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 8   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |




| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 1      | 0     | 0    | 0         | 1     | 0    |
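
The pick-up rule implied by the last two states can be sketched in a few lines of Python. This is only an illustration of the walkthrough, not the arbiter's actual logic: the `ThreadStatus` record and the `can_pick_up` function are hypothetical names, only four of the status fields are modelled, and unit availability and arbitration priority are ignored. The assumption is that a valid thread is held in the RS only while both PENDING and SERIAL are set.

```python
from dataclasses import dataclass


@dataclass
class ThreadStatus:
    """Subset of the status bits shown in the tables (hypothetical model)."""
    valid: bool
    type: str       # "ALU" or "TEX": the unit the thread wants next
    pending: bool   # a texture request is still outstanding in the TP
    serial: bool    # the next instruction must wait for that request


def can_pick_up(status: ThreadStatus) -> bool:
    """Assumed rule: block only when PENDING and SERIAL are both set."""
    return status.valid and not (status.pending and status.serial)


# First ALU instruction: texture outstanding, serial bit clear.
first_alu = ThreadStatus(valid=True, type="ALU", pending=True, serial=False)
# Second ALU instruction: texture outstanding, serial bit set.
second_alu = ThreadStatus(valid=True, type="ALU", pending=True, serial=True)

print(can_pick_up(first_alu))    # True
print(can_pick_up(second_alu))   # False

# The TP returns and clears the pending bit.
second_alu.pending = False
print(can_pick_up(second_alu))   # True
```

Under this reading, the serial bit only matters while a texture fetch is outstanding, which is why the first ALU instruction can overlap with the fetch and the second cannot.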

As soon as the TP clears the pending bit, the thread is picked up and returns:

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 9   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | TEX  | 0       | 0      | 0     | 0    | 0         | 1     | 0    |

Picked up by the TP and returns:

Execute 0 Alu

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 1          | 0   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 0      | 0     | 0    | 0         | 1     | 0    |

Picked up by the ALU and returns (let's say the TP has not returned yet):

Alloc Position 1

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 2          | 0   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 0      | 01    | 1    | 0         | 1     | 10   |
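
A thread whose status carries an allocation request, like the one in the table above, depends on space in the SX before it can run. The check can be sketched as follows. This is a hedged illustration only: the function name `ready_for_export`, the `sx_free` counter, and the reading of SIZE as a plain count of export slots are assumptions, and any non-zero ALLOC value is treated simply as "allocation requested" without decoding it further.

```python
def ready_for_export(alloc: int, size: int, sx_free: int) -> bool:
    """Return True when the thread needs no allocation or the SX can hold it.

    alloc   -- ALLOC field from the status bits (0 is assumed to mean no request)
    size    -- SIZE field, assumed here to count export slots
    sx_free -- hypothetical count of free export slots in the SX
    """
    if alloc == 0:
        return True              # nothing to allocate, no SX dependency
    return size <= sx_free       # wait in the RS until the SX has room


# Thread from the table above: ALLOC = 01, SIZE = 1.
print(ready_for_export(0b01, 1, sx_free=0))   # False
print(ready_for_export(0b01, 1, sx_free=1))   # True
```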

If the SX has room for the export, the SQ allocates the space and picks up the thread for execution. The thread returns to the RS in this state:

Execute 0 Alu 0 Tex

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 3          | 1   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

Status Bits

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | TEX  | 1       | 0      | 0     | 0    | 1         | 1     | 0    |




Now, since the TP has not returned yet, we must wait for it to return because we cannot issue multiple texture requests. The TP returns, clears the PENDING bit and we proceed:

Alloc Param 3

| State Bits |     |    |     |    |      |     |    |      |     |
|---------|-----|----|-----|----|------|-----|----|------|-----|
| CFP     | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 4       | 0   | 0  | 0   | 0  | 1    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 0      | 10    | 3    | 1         | 1     | 0    |

Once again the SQ makes sure the SX has enough room in the Parameter cache before it can pick up this thread.

Execute\_end 0 Alu 0 Tex 1 Alu 0 Alu

| State Bits |     |    |     |    |      |     |     |      |     |
|------------|-----|----|-----|----|------|-----|-----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB  | CPTR | LOD |
| 5          | 1   | 0  | 0   | 0  | 1    | 0   | 100 | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | TEX  | 1       | 0      | 0     | 0    | 1         | 1     | 0    |

This executes on the TP and then returns:

| State Bits |     |    |     |    |      |     |     |      |     |
|------------|-----|----|-----|----|------|-----|-----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB  | CPTR | LOD |
| 5          | 2   | 0  | 0   | 0  | 1    | 0   | 100 | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 1      | 0     | 0    | 1         | 1     | 1    |

The thread waits for the TP to return because texture reads are pending (and SERIAL is set in this case). It then executes and does not return to the RS because the LAST bit is set. This is the end of the thread; before dropping it, the SQ notifies the SX of export completion.
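The wait conditions in this walkthrough can be summarized in a short sketch. The readiness rule below is an assumption inferred from the example only (a TEX clause waits while a texture request is outstanding; an ALU clause waits on a pending texture read only when SERIAL is set). The class and function names are hypothetical and are not part of the SQ design.

```python
from dataclasses import dataclass


@dataclass
class ThreadStatus:
    """Subset of the status bits shown in the tables above (illustrative only)."""
    valid: bool
    clause_type: str  # "ALU" or "TEX", mirrors the TYPE field
    pending: bool     # a texture request is still outstanding at the TP
    serial: bool      # the clause must wait for outstanding texture reads
    last: bool        # this is the final clause of the thread


def can_issue(status: ThreadStatus) -> bool:
    """Assumed readiness rule, inferred from the walkthrough rather than from RTL."""
    if not status.valid:
        return False
    if status.clause_type == "TEX":
        # Multiple texture requests cannot be issued for the same thread.
        return not status.pending
    # ALU clause: a pending texture read blocks it only when SERIAL is set.
    return not (status.pending and status.serial)


def returns_to_rs(status: ThreadStatus) -> bool:
    """A thread goes back to the reservation station unless LAST is set."""
    return not status.last
```

With the final status row of the example (ALU, PENDING=1, SERIAL=1, LAST=1), `can_issue` is false until PENDING clears, and `returns_to_rs` is false because the thread ends here.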

#### 24. Open issues

Need to do some testing on the size of the register file as well as on the register file allocation method (dynamic vs. static).

Saving power?

Author: Laurent Lefebvre

| Issue To: | Copy No: |
|-----------|----------|
|           | 1        |

# **R400 Sequencer Specification**

# SQ

#### Version 2.06

Overview: This is an architectural specification for the R400 Sequencer block (SEQ). It provides an overview of the required capabilities and expected uses of the block. It also describes the block interfaces, internal subblocks, and provides internal state diagrams.

AUTOMATICALLY UPDATED FIELDS:

Document Location: C:\perforce\r400\doc\_lib\design\blocks\sq\R400\_Sequencer.doc

Current Intranet Search Title: R400 Sequencer Specification

|           | APPROVALS |                |
|-----------|-----------|----------------|
| Name/Dept |           | Signature/Date |
|           |           |                |
|           |           |                |
|           |           |                |
|           |           |                |
|           |           |                |

Remarks:

THIS DOCUMENT CONTAINS CONFIDENTIAL INFORMATION THAT COULD BE SUBSTANTIALLY DETRIMENTAL TO THE INTEREST OF ATI TECHNOLOGIES INC. THROUGH UNAUTHORIZED USE OR DISCLOSURE.

"Copyright 2001, ATI Technologies Inc. All rights reserved. The material in this document constitutes an unpublished work created in 2001. The use of this copyright notice is intended to provide notice that ATI owns a copyright in this unpublished work. The copyright notice is not an admission that publication has occurred. This work contains confidential, proprietary information and trade secrets of ATI. No part of this document may be used, reproduced, or transmitted in any form or by any means without the prior written permission of ATI Technologies Inc."





# Table Of Contents

| Section | Title                                          |
|---------|------------------------------------------------|
| 1.      | OVERVIEW                                       |
| 1.1     | Top Level Block Diagram                        |
| 1.2     | Data Flow graph (SP)                           |
| 1.3     | Control Graph                                  |
| 2.      | INTERPOLATED DATA BUS                          |
| 3.      | INSTRUCTION STORE                              |
| 4.      | SEQUENCER INSTRUCTIONS                         |
| 5.      | CONSTANT STORES                                |
| 5.1     | Memory organizations                           |
| 5.2     | Management of the Control Flow Constants       |
| 5.3     | Management of the re-mapping tables            |
| 5.3.1   | R400 Constant management                       |
| 5.3.2   | Proposal for R400LE constant management        |
| 5.3.3   | Dirty bits                                     |
| 5.3.4   | Free List Block                                |
| 5.3.5   | De-allocate Block                              |
| 5.3.6   | Operation of Incremental model                 |
| 5.4     | Constant Store Indexing                        |
| 5.5     | Real Time Commands                             |
| 5.6     | Constant Waterfalling                          |
| 6.      | LOOPING AND BRANCHES                           |
| 6.1     | The controlling state                          |
| 6.2     | The Control Flow Program                       |
| 6.2.1   | Control flow instructions table                |
| 6.3     | Implementation                                 |
| 6.4     | Data dependant predicate instructions          |
| 6.5     | HW Detection of PV,PS                          |
| 6.6     | Register file indexing                         |
| 6.7     | Debugging the Shaders                          |
| 6.7.1   | Method 1: Debugging registers                  |
| 6.7.2   | Method 2: Exporting the values in the GPRs     |
| 7.      | PIXEL KILL MASK                                |
| 8.      | MULTIPASS VERTEX SHADERS (HOS)                 |
| 9.      | REGISTER FILE ALLOCATION                       |
| 10.     | FETCH ARBITRATION                              |
| 11.     | ALU ARBITRATION                                |
| 12.     | HANDLING STALLS                                |
| 13.     | CONTENT OF THE RESERVATION STATION FIFOS       |
| 14.     | THE OUTPUT FILE                                |
| 15.     | IJ FORMAT                                      |
| 15.1    |                                                |
| 16.     | STAGING REGISTERS                              |
|         | OPEN ISSUES                                    |




# Revision Changes:

| Revision       | Author           | Date                                             | Changes |
|----------------|------------------|--------------------------------------------------|---------|
| 0.1            | Laurent Lefebvre | May 7, 2001                                      | First draft. |
| 0.2, 0.3       | Laurent Lefebvre | July 9, 2001; August 6, 2001                     | Changed the interfaces to reflect the changes in the SP. Added some details in the arbitration section. Reviewed the Sequencer spec after the meeting on August 3, 2001. |
| 0.4, 0.5       | Laurent Lefebvre | August 24, 2001; September 7, 2001               | Added the dynamic allocation method for register file and an example (written in part by Vic) of the flow of pixels/vertices in the sequencer. Added timing diagrams (Vic). |
| 0.6            | Laurent Lefebvre | September 24, 2001                               | Changed the spec to reflect the new R400 architecture. Added interfaces. |
| 0.7            | Laurent Lefebvre | October 5, 2001                                  | Added constant store management, instruction store management, control flow management and data dependent predication. |
| 0.8, 0.9, 1.0  | Laurent Lefebvre | October 8, 2001; October 17, 2001; October 19, 2001 | Changed the control flow method to be more flexible. Also updated the external interfaces. Incorporated changes made in the 10/18/01 control flow meeting. Added a NOP instruction, removed the conditional\_execute\_or\_jump. Added debug registers. Refined interfaces to RB. Added state registers. |
| 1.1            | Laurent Lefebvre | October 26, 2001                                 | Added SEQ-SP0 interfaces. Changed delta precision. Changed VGT-SP0 interface. Debug Methods added. |
| 1.2            | Laurent Lefebvre | November 16, 2001                                | Interfaces greatly refined. Cleaned up the spec. |
| 1.3            | Laurent Lefebvre | November 26, 2001                                | Added the different interpolation modes. |
| 1.4            | Laurent Lefebvre | December 6, 2001                                 | Added the auto incrementing counters. Changed the VGT-SQ interface. Added content on constant management. Updated GPRs. |
| 1.5            | Laurent Lefebvre | December 11, 2001                                | Removed from the spec all interfaces that weren't directly tied to the SQ. Added explanations on constant management. Added synchronization fields and explanation. |
| 1.6            | Laurent Lefebvre | January 7, 2002                                  | Added more details on the staging register. Added detail about the parameter caches. Changed the call instruction to a Conditional call instruction. Added details on constant management and updated the diagram. |
| 1.7            | Laurent Lefebvre | February 4, 2002                                 | Added Real Time parameter control in the SX interface. Updated the control flow section. |
| 1.8            | Laurent Lefebvre | March 4, 2002                                    | New interfaces to the SX block. Added the end of clause modifier, removed the end of clause instructions. |
| 1.9            | Laurent Lefebvre | March 18, 2002                                   | Rearrangement of the CF instruction bits in order to ensure byte alignment. |
| 1.10           | Laurent Lefebvre | March 25, 2002                                   | Updated the interfaces and added a section on exporting rules. |
| 1.11           | Laurent Lefebvre | April 19, 2002                                   | Added CP state report interface. Last version of the spec with the old control flow scheme. |
| 2.0            | Laurent Lefebvre | April 19, 2002                                   | New control flow scheme. |



ORIGINATE DATE

24 September, 2001


| Revision | Author           | Date               | Changes |
|----------|------------------|--------------------|---------|
| 2.01     | Laurent Lefebvre | May 2, 2002        | Slightly changed the control flow instructions to allow forced jumps and calls. |
| 2.02     | Laurent Lefebvre | May 13, 2002       | Updated the Opcodes. Added type field to the constant/pred interface. Added Last field to the SQ→SP instruction load interface. |
| 2.03     | Laurent Lefebvre | July 15, 2002      | SP interface updated to include predication optimizations. Added the predicate no stall instructions. |
| 2.04     | Laurent Lefebvre | August 2, 2002     | Documented the new parameter generation scheme for XY coordinates points and lines STs. |
| 2.05     | Laurent Lefebvre | September 10, 2002 | Some interface changes and an architectural change to the auto-counter scheme. |
| 2.06     | Laurent Lefebvre | October 11, 2002   | Widened the event interface to 5 bits. Some other little typos corrected. |




### 1. Overview

The sequencer chooses two ALU threads and a fetch thread to execute, and executes all of the instructions in a block before looking for a new clause of the same type. The two ALU threads are executed interleaved to hide the ALU latency. The arbitrator gives priority to older threads. There are two separate reservation stations, one for pixel vectors and one for vertex vectors. This way a pixel can pass a vertex and a vertex can pass a pixel.
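As a rough illustration of the oldest-first priority within one reservation station, the sketch below scans a station that is kept in age order and returns the first ready thread. The data layout and names are hypothetical; how the arbiter chooses between the pixel and vertex stations is not modeled here.

```python
from typing import List, Optional, Tuple


def pick_oldest_ready(station: List[Tuple[str, bool]]) -> Optional[str]:
    """Return the id of the oldest ready thread, or None if nothing is ready.

    `station` is assumed to be ordered oldest first; each entry is
    (thread_id, ready).
    """
    for thread_id, ready in station:
        if ready:
            return thread_id
    return None


# Separate stations let a ready vertex thread proceed past a stalled pixel
# thread, and vice versa.
pixel_rs = [("p0", False), ("p1", True)]
vertex_rs = [("v0", True)]
```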

To support the shader pipe the sequencer also contains the shader instruction cache, constant store, control flow constants and texture state. The four shader pipes all execute the same instruction, so there is only one sequencer for the whole chip.

The sequencer first arbitrates between vectors of 64 vertices that arrive directly from primitive assembly and vectors of 16 quads (64 pixels) that are generated in the scan converter.

The vertex or pixel program specifies how many GPRs it needs to execute. The sequencer will not start the next vector until the needed space is available in the GPRs.
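The admission check described here can be pictured as a simple free-space counter. This is a minimal sketch under that assumption; the pool size and the class and method names are hypothetical, and the actual allocation method (dynamic or static) is covered elsewhere in the specification.

```python
class GprPool:
    """Tracks free general-purpose registers (illustrative counter model)."""

    def __init__(self, total_registers: int) -> None:
        self.free = total_registers

    def try_start_vector(self, needed: int) -> bool:
        """Start a vector only if its GPR requirement fits in the free space."""
        if needed > self.free:
            return False  # the sequencer holds the vector until space frees up
        self.free -= needed
        return True

    def release(self, count: int) -> None:
        """Return registers to the pool when a vector finishes."""
        self.free += count
```

For example, with a hypothetical pool of 128 registers, a vector needing 100 starts immediately, while a following vector needing 40 waits until the first one releases its registers.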




### 1.1 Top Level Block Diagram



Figure 2: Reservation stations and arbiters

Under this new scheme, the sequencer (SQ) will only use one global state management machine per vector type (pixel, vertex) that we call the reservation station (RS).






The gray area represents blocks that are replicated 4 times per shader pipe (16 times on the overall chip).

### 1.3 Control Graph



Figure 4: Sequencer Control interfaces

The Fetch control interface is shown in green, the ALU control interface in red, the Interpolated/Vector control interface in blue, and the output file control interface in purple.

### 2. Interpolated data bus

The interpolators contain an IJ buffer to pack the information as much as possible before writing it to the register file.



[Timing diagram: time steps T0 through T23 for shader pipes SP0 through SP3, showing XY and IJ data for groups of quads being written into the buffers and read out to the GPRs.]

Figure 6: Interpolation timing diagram





Above is an example of a tile the sequencer might receive from the SC. The write side shows how the data get stacked into the XY and IJ buffers; the read side shows how the data is passed to the GPRs. The IJ information is packed in the IJ buffer 4 quads at a time, or two clocks. At any given time the sequencer allows as many as four quads to interpolate a parameter, and they all have to come from the same primitive. The sequencer then controls the write mask to the GPRs to write the valid data in.

### 3. Instruction Store

There is going to be only one instruction store for the whole chip. It will contain 4096 instructions of 96 bits each.

It is likely to be a 1-port memory; we use 1 clock to load the ALU instruction, 1 clock to load the Fetch instruction, 1 clock to load 2 control flow instructions and 1 clock to write instructions.
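One way to picture the sharing of the single port is a fixed four-slot rotation, sketched below. The slot order is an assumption (the text gives only the number of clocks per access type), and all names are hypothetical. The capacity figure follows directly from 4096 instructions of 96 bits each.

```python
INSTRUCTION_COUNT = 4096
INSTRUCTION_BITS = 96

# Assumed slot order; the specification states only that each access
# type takes one clock on the shared port.
PORT_SLOTS = ("alu_read", "fetch_read", "control_flow_read_x2", "write")


def port_owner(clock: int) -> str:
    """Return which access type owns the memory port on a given clock."""
    return PORT_SLOTS[clock % len(PORT_SLOTS)]


def store_size_bytes() -> int:
    """Total instruction store capacity in bytes."""
    return INSTRUCTION_COUNT * INSTRUCTION_BITS // 8
```

Under this rotation each access type gets the port once every four clocks, and the store holds 49152 bytes (48 KiB).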

The instruction store is loaded by the CP through the register mapped registers.

The VS\_BASE and PS\_BASE context registers are used to specify for each context where its shader is in the instruction memory.

For Real Time commands the story is much the same, with a few small differences. There are no wrap-around points for real time, so the driver must be careful not to overwrite regular shader data. The shared code (shared subroutines) uses the same path as real time.

### 4. Sequencer Instructions

All control flow instructions and move instructions are handled by the sequencer only. The ALUs will perform NOPs during this time (MOV PV,PV, PS,PS) if they have nothing else to do.

### 5. Constant Stores

### 5.1 Memory organizations

A likely size for the ALU constant store is 1024x128 bits. The read BW from the ALU constant store is 128 bits/clock and the write bandwidth is 32 bits/clock (directed by the CP bus size not by memory ports).

The maximum logical size of the constant store for a given shader is 256 constants, or 512 for the pixel/vertex shader pair. The size of the re-mapping table is 128 lines (each line addresses 4 constants). The write granularity is 4 constants or 512 bits. It takes 16 clocks to write the four constants. Real time requires 256 lines in the physical memory (this is physically register mapped).

The texture state is also kept in a similar memory. The size of this memory is 320x96 bits (128 texture states for regular mode, 32 states for RT). The memory thus holds 128 texture states (192 bits per state). The logical size exposes 32 different states total, which are going to be shared between the pixel and the vertex shader. The size of the re-mapping table for the texture state memory is 32 lines (each line addresses 1 texture state line in the real memory). The CP write granularity is 1 texture state line (or 192 bits). The driver sends 512 bits but the CP ignores the top 320 bits. It thus takes 6 clocks to write the texture state. Real time requires 32 lines in the physical memory (this is physically register mapped).
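The write times quoted above follow from dividing the write granularity by the CP write bandwidth. The sketch below only redoes that arithmetic; the helper name `clocks_to_write` is hypothetical, and it assumes the texture state is written over the same 32 bits/clock path as the ALU constants.

```python
CP_WRITE_BITS_PER_CLOCK = 32  # write bandwidth quoted for the ALU constant store


def clocks_to_write(bits: int, bus_bits: int = CP_WRITE_BITS_PER_CLOCK) -> int:
    """Clocks needed to push `bits` through a bus `bus_bits` wide, rounded up."""
    return -(-bits // bus_bits)


alu_write_bits = 4 * 128   # write granularity: 4 constants of 128 bits each
tex_write_bits = 192       # one texture state (the CP drops the top 320 of 512 bits)

print(clocks_to_write(alu_write_bits))  # 16
print(clocks_to_write(tex_write_bits))  # 6
```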

The control flow constant memory doesn't sit behind a renaming table. It is register mapped and thus the driver must reload its content each time there is a change in the control flow constants. Its size is 320\*32 because it must hold 8 copies of the 32 dwords of control flow constants and the loop construct constants must be aligned.

The constant re-mapping tables for texture state and ALU constants are logically register mapped for regular mode and physically register mapped for RT operation.




### 5.2 Management of the Control Flow Constants

The control flow constants are register mapped, thus the CP writes to the corresponding register to set the constant; the SQ decodes the address and writes to the block pointed to by its current base pointer (CF\_WR\_BASE). On the read side, one level of indirection is used. A register (SQ\_CONTEXT\_MISC.CF\_RD\_BASE) keeps the current base pointer to the control flow block. This register is copied whenever there is a state change. Should the CP write to CF after the state change, the base register is updated with (current pointer number + 1) % number of states. This way, if the CP doesn't write to CF, the state is going to use the previous CF constants.
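A minimal behavioural sketch of this base-pointer scheme is given below. The class and method names are hypothetical, and two details are assumptions rather than statements of the specification: the previous block's contents are carried over when the base pointer advances, and a single `base` value stands in for both CF\_WR\_BASE and the most recent CF\_RD\_BASE.

```python
NUM_STATES = 8   # copies of the control flow constants
CF_DWORDS = 32   # dwords of control flow constants per copy


class ControlFlowConstants:
    def __init__(self):
        self.blocks = [[0] * CF_DWORDS for _ in range(NUM_STATES)]
        self.base = 0                       # current base pointer (block index)
        self.written_since_change = True    # block 0 belongs to the initial state
        self.rd_base = {0: 0}               # per-state copy of the base pointer

    def state_change(self, new_state: int) -> None:
        # The new state starts out pointing at the previous constants.
        self.rd_base[new_state] = self.base
        self.written_since_change = False

    def cp_write(self, state: int, index: int, value: int) -> None:
        if not self.written_since_change:
            new_base = (self.base + 1) % NUM_STATES
            # Assumption: carry the old values over so unwritten entries stay valid.
            self.blocks[new_base] = list(self.blocks[self.base])
            self.base = new_base
            self.rd_base[state] = new_base
            self.written_since_change = True
        self.blocks[self.base][index] = value

    def read(self, state: int, index: int) -> int:
        return self.blocks[self.rd_base[state]][index]
```

With this model, a state that never writes CF keeps reading the block of the state before it, and the first write after a state change moves only the writing state to the next block.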

### 5.3 Management of the re-mapping tables

### 5.3.1 R400 Constant management

The sequencer is responsible for managing two re-mapping tables (one for the constant store and one for the texture state). On a state change (by the driver), the sequencer will broadside copy the contents of its re-mapping tables to a new one. We have 8 different re-mapping tables we can use concurrently.

The constant memory update will be incremental: the driver only needs to update the constants that actually changed between the two state changes.

For this model to work in its simplest form, the requirement is that the physical memory MUST be at least twice as large as the logical address space + the space allocated for Real Time. In our case, since the logical address space is 512 and the reserved RT space can be up to 256 entries, the memory must be of sizes 1280 and above. Similarly the size of the texture store must be of 32\*2+32 = 96 entries and above.
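The sizing rule can be checked with a one-line calculation. The function name below is hypothetical; the numbers are the ones quoted in the paragraph above.

```python
def min_physical_entries(logical_entries: int, rt_entries: int) -> int:
    """Twice the logical address space plus the space reserved for Real Time."""
    return 2 * logical_entries + rt_entries


print(min_physical_entries(512, 256))  # 1280 entries for the ALU constant store
print(min_physical_entries(32, 32))    # 96 entries for the texture state store
```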

### 5.3.2 Proposal for R400LE constant management

To make this scheme work with only 512+256 = 768 entries, upon reception of a CONTROL packet of state + 1, the sequencer would check for SQ\_IDLE and PA\_IDLE and, if both are idle, will erase the content of state to replace it with the new state (this is depicted in Figure 8: De-allocation mechanism). Note that in the case a state is cleared, a value of 0 is written to the corresponding de-allocation counter location so that when the SQ is going to report a state change, nothing will be de-allocated upon the first report.

The second path sets all context dirty bits that were used in the current state to 1 (thus allowing the new state to reuse these physical addresses if needed).








Figure 8: De-allocation mechanism for R400LE

### 5.3.3 Dirty bits

Two sets of dirty bits will be maintained per logical address. The first one (the reset dirty bit) will be set to zero on reset and set when the logical address is addressed. The second one (the context dirty bit) will be set to zero whenever a new context is written and set for each address written while in this context. If the reset dirty bit is not set, then writing to that logical address will not require de-allocation of whatever address is stored in the renaming table. If it is set and the context dirty bit is not set, then the physical address stored needs to be de-allocated and a new physical address is necessary to store the incoming data. If they are both set, then the data will be written into the physical address held in the renaming table for the current logical address. No de-allocation or allocation takes place. This will happen when the driver does a set constant twice to the same logical address between context changes. NOTE: It is important to detect and prevent this; failure to do so will allow multiple writes to allocate all physical memory and thus hang, because a context will not fit for rendering to start and thus free up space.
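The three cases above reduce to a small decision function. This is only an illustrative sketch; `WriteAction` and `classify_write` are hypothetical names, not part of the design.

```python
from enum import Enum


class WriteAction(Enum):
    ALLOCATE = "allocate a new physical address, nothing to free"
    REALLOCATE = "free the old physical address and allocate a new one"
    OVERWRITE = "write in place, no allocation or de-allocation"


def classify_write(reset_dirty: bool, context_dirty: bool) -> WriteAction:
    """Decide what a set-constant write to one logical address requires."""
    if not reset_dirty:
        # First write to this logical address since reset.
        return WriteAction.ALLOCATE
    if not context_dirty:
        # Written in an earlier context, first write in this one.
        return WriteAction.REALLOCATE
    # Second write to the same address within the current context.
    return WriteAction.OVERWRITE
```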

#### 5.3.4 Free List Block

The free list block would consist of a counter (called the IFC or Initial Free Counter) that resets to zero and is incremented every time a chunk of physical memory is used, until all chunks have been used once. This counter would be checked each time a physical block is needed: if the original ones have not been used up, use a new one, else check the free list for an available physical block address. When a chunk is taken from the counter, the count is its physical address.

The block also needs storage for a free list big enough to store all physical block addresses.

Maintain three pointers for the free list that are reset to zero. The first one we will call write\_ptr. This pointer will identify the next location to write the physical address of a block to be de-allocated. Note: we can never free more physical memory locations than we have. Once an address is recorded, the pointer will be incremented to walk the free list like a ring.

The second pointer will be called stop\_ptr. The stop\_ptr pointer will be advanced by the number of address chunks de-allocated when a context finishes. The addresses between the stop\_ptr and write\_ptr cannot be reused because they are still in use. But as soon as the context using them is dismissed, the stop\_ptr will be advanced.

The third pointer will be called read\_ptr. This pointer will point to the next address that can be used for allocation, as long as the read\_ptr does not equal the stop\_ptr and the IFC is at its maximum count.
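The counter and the three pointers can be modelled in a few lines. The sketch below is a software illustration under stated assumptions, not the hardware design: the class and method names are hypothetical, and the pointers are kept as unbounded integers (indexed modulo the ring size) so that "empty" and "full" cannot be confused, where hardware would more likely use a wrap bit.

```python
class FreeList:
    def __init__(self, num_blocks: int):
        self.num_blocks = num_blocks
        self.ifc = 0                      # Initial Free Counter
        self.ring = [None] * num_blocks   # storage for freed physical addresses
        self.write_ptr = 0                # next slot to record a freed address
        self.stop_ptr = 0                 # entries before this are safe to reuse
        self.read_ptr = 0                 # next entry to hand out

    def allocate(self):
        """Return a physical block address, or None when the CP must be stalled."""
        if self.ifc < self.num_blocks:
            addr = self.ifc               # the count is the physical address
            self.ifc += 1
            return addr
        if self.read_ptr != self.stop_ptr:
            addr = self.ring[self.read_ptr % self.num_blocks]
            self.read_ptr += 1
            return addr
        return None                       # counter at max and stop_ptr == read_ptr

    def record_free(self, addr: int) -> None:
        """Note a block to be de-allocated; it stays unusable until released."""
        self.ring[self.write_ptr % self.num_blocks] = addr
        self.write_ptr += 1

    def release(self, count: int) -> None:
        """A context finished: make `count` recorded blocks available."""
        self.stop_ptr += count
```

A block recorded with `record_free` only becomes allocatable after `release`, which mirrors the rule that addresses between stop\_ptr and write\_ptr are still in use.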




### 5.3.5 De-allocate Block

This block will maintain a free physical address block count for each context. While in current context, a count shall be maintained specifying how many blocks were written into the free list at the write\_ptr pointer. This count will be reset upon reset or when this context is active on the back and different than the previous context. It is actually a count of blocks in the previous context that will no longer be used. This count will be used to advance the write\_ptr pointer to make available the set of physical blocks freed when the previous context was done. This allows the discard or de-allocation of any number of blocks in one clock.

#### 5.3.6 Operation of Incremental model

The basic operation of the model would start with the write\_ptr, stop\_ptr and read\_ptr pointers in the free list set to zero and the free list counter set to zero. Also, all the dirty bits and the previous context will be initialized to zero. When the first set constants happen, the reset dirty bit will not be set, so we will allocate a physical location from the free list counter because it's not at the max value. The data will be written into physical address zero. Both the additional copy of the renaming table and the context zeros of the big renaming table will be updated for the logical address that was written by set start with physical address of 0. This process will be repeated for any logical addresses that are not dirty until the context changes. If a logical address is hit that has its dirty bits set while in the same context, both dirty bits would be set, so the new data will be written over the last physical address assigned for this logical address. When the first draw command of the context is detected, the previous context stored in the additional renaming table will be copied to the larger renaming table in the current (new) context location. Then the set constant logical address will be loaded with a new physical address during the copy and, if the reset dirty was set, the physical address it replaced in the renaming table would be entered at the write\_ptr pointer location on the free list and the write\_ptr will be incremented. The de-allocation counter for the previous context (eight) will be incremented. Then, as set states come in for this context, one of the following will happen:

- No dirty bits are set for the logical address being updated. A line will be allocated from the free-list counter, or from the free list at the read\_ptr pointer if read\_ptr != stop\_ptr.
- Reset dirty set and Context dirty not set. A new physical address is allocated, the physical address in the renaming table is put on the free list at write\_ptr, and write\_ptr is incremented along with the de-allocate counter for the last context.
- Context dirty is set. The data will be written into the physical address specified by the logical address.

This process will continue as long as set states arrive. This block will provide backpressure to the CP whenever it has no free list entries available (counter at max and stop\_ptr == read\_ptr). The command stream will keep a count of contexts of constants in use and prevent more than the maximum number of constant contexts from being sent.

Whenever a draw packet arrives, the content of the re-mapping table is written to the correct re-mapping table for the context number. Also, if the next context uses fewer constants than the current one, all exceeding lines are moved to the free list to be de-allocated later. This happens in parallel with the writing of the re-mapping table to the correct memory.

Preferably, when the constant context leaves the last ALU clause it will be sent to this block and compared with the previous context that left (initialized to zero). If they differ, then the older context will no longer be referenced and thus can be de-allocated in the physical memory. This is accomplished by adding the number of blocks freed by this context to the stop\_ptr pointer. This will make all the physical addresses used by this context available to the read\_ptr allocate pointer for future allocation.

This device allows representation of multiple contexts of constants data with N copies of the logical address space. It also allows the second context to be represented as the first set plus some new additional data by just storing the deltas. It allows memory to be used efficiently, and when the constant updates are small it can store multiple contexts. However, if the updates are large, fewer contexts will be stored and performance will potentially be degraded, although it will still perform as well as a ring could in this case.
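The retirement step can be sketched as a per-context counter that is folded into stop\_ptr once a different context is seen leaving the last ALU clause. The names below are hypothetical, and the sketch assumes the count for a context is the number of its blocks that a later context replaced.

```python
NUM_CONTEXTS = 8


class DeallocTracker:
    def __init__(self):
        self.dealloc_count = [0] * NUM_CONTEXTS  # blocks to free once a context is done
        self.stop_ptr = 0
        self.prev_retired = 0                    # previous context that left, init to zero

    def block_freed(self, context: int) -> None:
        """A block belonging to `context` was replaced and put on the free list."""
        self.dealloc_count[context] += 1

    def context_left_last_clause(self, context: int) -> None:
        """Called with the context number of each thread leaving the last ALU clause."""
        if context != self.prev_retired:
            # The older context is no longer referenced: release its blocks at once.
            self.stop_ptr += self.dealloc_count[self.prev_retired]
            self.dealloc_count[self.prev_retired] = 0
            self.prev_retired = context
```

Because the count is added to stop\_ptr in a single step, any number of blocks is released at once, however many were recorded.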

#### 5.4 Constant Store Indexing

In order to do constant store indexing, the sequencer must be loaded first with the indexes (that come from the GPRs). There are 144 wires from the exit of the SP to the sequencer (9 bits pointers x 16 vertexes/clock). Since the data must pass thru the Shader pipe for the float to fixed conversion, there is a latency of 4 clocks (1 instruction) between the time the sequencer is loaded and the time one can index into the constant store. The assembly will look like this:

MOVA R1.X,R2.X // Loads the sequencer with the content of R2.X, also copies the content of R2.X into R1.X

NOP // latency of the float to fixed conversion

ADD R3,R4,C0[R2.X] // Uses the state from the sequencer to add R4 to C0[R2.X] into R3

Note that we don't really care about what is in the brackets because we use the state from the MOVA instruction. R2.X is just written again for the sake of simplicity and coherency.

The storage needed in the sequencer in order to support this feature is 2\*64\*9 bits = 1152 bits.

#### 5.5 Real Time Commands

The real time command constants are written by the CP using the register mapped registers allocated for RT. This works the same way as regular constant loads, BUT in this case the CP is not sending a logical address but rather a physical address, and the reads do not pass thru the re-mapping table but are read directly from the memory. The boundary between the two zones is defined by the CONST\_EO\_RT control register. Similarly, for the fetch state, the boundary between the two zones is defined by the TSTATE\_EO\_RT control register.

### 5.6 Constant Waterfalling

In order to have reasonable performance in the case of constant store indexing using the address register, we are going to have the possibility of using the physical memory port for read only. This way we can read 1 constant per clock and thus have a worst-case waterfall mode of 1 vertex per clock. There is a small synchronization issue related with this, as the SQ needs to make sure that the constants were actually written to memory (not only sent to the sequencer) before it can allow the first vector of pixels or vertices of the state to go thru the ALUs. To do so, the sequencer keeps 8 bits (one per render state) and sets the bit whenever the last render state is written to memory and clears the bit whenever a state is freed.
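As a rough illustration of the waterfall cost, the sketch below counts the clocks needed to serve one group of indexed reads at 1 constant per clock. It assumes, beyond what the text states, that vertices sharing the same index can be served by a single read; the worst case of all-different indexes then gives the 1 vertex per clock figure. The function name is hypothetical.

```python
def waterfall_clocks(indices):
    """Clocks to fetch constants for one group of vertices, 1 constant read per clock.

    Assumption: vertices that share an index are served by the same read.
    """
    return len(set(indices))


print(waterfall_clocks([5] * 16))         # 1: every vertex uses the same constant
print(waterfall_clocks(list(range(16))))  # 16: worst case, one vertex per clock
```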



Figure 9: The Constant store




### 6. Looping and Branches

Loops and branches are planned to be supported and will have to be dealt with at the sequencer level. We plan on supporting constant loops and branches using a control program.

### 6.1 The controlling state.

The R400 controlling state consists of:

Boolean[255:0]

Loop\_count[7:0][31:0]

Loop\_Start[7:0][31:0]

Loop\_Step[7:0][31:0]

That is 256 Booleans and 32 loops.

We have a stack of 4 elements for nested calls of subroutines and 4 loop counters to allow for nested loops.

This state is available on a per shader program basis.

#### 6.2 The Control Flow Program

We'd like to be able to code up a program of the form:

- 1: Loop
- 2: Exec TexFetch
- 3: TexFetch
- 4: ALU
- 5: ALU
- 6: TexFetch
- 7: End Loop
- 8: ALU Export

But realize that 3: may be dependent on 2: and 4: is almost certainly dependent on 2: and 3:. Without clausing, these dependencies need to be expressed in the Control Flow instructions. Additionally, without separate 'texture clauses' and 'ALU clauses' we need to know which instructions to dispatch to the Texture Unit and which to the ALU unit. This information will be encapsulated in the flow control instructions.

Each control flow instruction will contain 2 bits of information for each (non-control flow) instruction:

- a) ALU or Texture
- b) Serialize Execution

(b) would force the thread to stop execution at this point (before the instruction is executed) and wait until all textures have been fetched. Given the allocation of reserved bits, this would mean that the count of an 'Exec' instruction would be limited to about 8 (non-control-flow) instructions. If more than this were needed, a second Exec (with the same conditions) would be issued.

Another function that relies upon 'clauses' is allocation and order of execution. We need to assure that pixels and vertices are exported in the correct order (even if not all execution is ordered) and that space in the output buffers is allocated in order. Additionally data can't be exported until space is allocated. A new control flow instruction:

Alloc <buffer select -- position,parameter, pixel or vertex memory. And the size required>.

would be created to mark where such allocation needs to be done. To assure allocation is done in order, the actual allocation for a given thread cannot be performed unless the equivalent allocation for all previous threads is already completed. The implementation would also assure that execution of instruction(s) following the serialization due to the Alloc will occur in order -- at least until the next serialization or change from ALU to Texture. In most cases this will allow the exports to occur without any further synchronization. Only 'final' allocations or position allocations are guaranteed to be ordered. Because strict ordering is required for pixels, parameters and positions, this implies only a single alloc for these structures. Vertex exports to memory do not require ordering during allocation and so multiple 'allocs' may be done.

#### 6.2.1 Control flow instructions table

Here is the revised control flow instruction set.

Note that whenever a field is marked as RESERVED, it is assumed that all the bits of the field are cleared (0).

| NOP |  |  |
|---|---|---|
| 47 ... 44 | 43 | 42 ... 0 |
| 0000 | Addressing | RESERVED |

This is a regular NOP.

| Execute |  |  |  |  |  |
|---|---|---|---|---|---|
| 47 ... 44 | 43 | 40 ... 34 | 33 ... 16 | 15 ... 12 | 11 ... 0 |
| 0001 | Addressing | RESERVED | Instructions type + serialize (9 instructions) | Count | Exec Address |

| Execute_End |  |  |  |  |  |
|---|---|---|---|---|---|
| 47 ... 44 | 43 | 40 ... 34 | 33 ... 16 | 15 ... 12 | 11 ... 0 |
| 0010 | Addressing | RESERVED | Instructions type + serialize (9 instructions) | Count | Exec Address |

Execute up to 9 instructions at the specified address in the instruction memory. The Instruction type field tells the sequencer the type of the instruction (LSB) (1 = Texture, 0 = ALU) and whether or not to serialize the execution (MSB) (1 = Serialize, 0 = Non-Serialized). If Execute\_End, this is the last execution block of the shader program.
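To make the field layout concrete, here is a small decoder for the Execute encoding, using the bit positions from the tables above (Count in bits 15 to 12, Exec Address in bits 11 to 0, type + serialize in bits 33 to 16). It is only a sketch: the function names are hypothetical, and the placement of instruction i in bits 2i+1..2i of the type field is an assumption, since the ordering within that field is not stated.

```python
def bits(word: int, hi: int, lo: int) -> int:
    """Extract word[hi:lo] (inclusive) as an unsigned integer."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_execute(word: int) -> dict:
    count = bits(word, 15, 12)
    if count > 9:
        raise ValueError("an Execute block covers at most 9 instructions")
    type_field = bits(word, 33, 16)            # 2 bits per instruction
    instructions = []
    for i in range(count):
        pair = (type_field >> (2 * i)) & 0b11  # assumed: instruction i in bits 2i+1..2i
        instructions.append({
            "unit": "TEX" if pair & 1 else "ALU",  # LSB: 1 = Texture, 0 = ALU
            "serialize": bool(pair >> 1),          # MSB: 1 = Serialize
        })
    return {
        "opcode": bits(word, 47, 44),
        "addressing": bits(word, 43, 43),
        "count": count,
        "exec_address": bits(word, 11, 0),
        "instructions": instructions,
    }


# Execute, 2 instructions at address 0x040: a texture fetch, then a serialized ALU op.
word = (0b0001 << 44) | (0b1001 << 16) | (2 << 12) | 0x040
decoded = decode_execute(word)
```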

| Conditional_Execute |  |  |  |  |  |  |
|---|---|---|---|---|---|---|
| 47 ... 44 | 43 | 42 | 41 ... 34 | 33 ... 16 | 15 ... 12 | 11 ... 0 |
| 0011 | Addressing | Condition | Boolean address | Instructions type + serialize (9 instructions) | Count | Exec Address |

| Conditional_Execute_End |  |  |  |  |  |  |
|---|---|---|---|---|---|---|
| 47 ... 44 | 43 | 42 | 41 ... 34 | 33 ... 16 | 15 ... 12 | 11 ... 0 |
| 0100 | Addressing | Condition | Boolean address | Instructions type + serialize (9 instructions) | Count | Exec Address |

If the specified Boolean (8 bits can address 256 Booleans) meets the specified condition then execute the specified instructions (up to 9 instructions). If the condition is not met, we go on to the next control flow instruction. If Conditional\_Execute\_End and the condition is met, this is the last execution block of the shader program.

| Conditional_Execute_Predicates |  |  |  |  |  |  |  |
|---|---|---|---|---|---|---|---|
| 47 ... 44 | 43 | 42 | 41 ... 36 | 35 ... 34 | 33 ... 16 | 15 ... 12 | 11 ... 0 |
| 0101 | Addressing | Condition | RESERVED | Predicate vector | Instructions type + serialize (9 instructions) | Count | Exec Address |

| Conditional_Execute_Predicates_End |  |  |  |  |  |  |  |
|---|---|---|---|---|---|---|---|
| 47 ... 44 | 43 | 42 | 41 ... 36 | 35 ... 34 | 33 ... 16 | 15 ... 12 | 11 ... 0 |
| 0110 | Addressing | Condition | RESERVED | Predicate vector | Instructions type + serialize (9 instructions) | Count | Exec Address |

Check the AND/OR of all current predicate bits. If AND/OR matches the condition execute the specified number of instructions. We need to AND/OR this with the kill mask in order not to consider the pixels that aren't valid. If the condition is not met, we go on to the next control flow instruction. If Conditional\_Execute\_Predicates\_End and the condition is met, this is the last execution block of the shader program.
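The masking of the predicate reduction by the valid pixels can be sketched with plain integer bit vectors. The two helpers below are hypothetical; they show only the reductions themselves and deliberately do not model how the Condition bit selects between them, which the text does not spell out.

```python
def any_predicate(pred: int, valid: int) -> bool:
    """OR reduction: true if at least one valid pixel has its predicate bit set."""
    return (pred & valid) != 0


def all_predicates(pred: int, valid: int) -> bool:
    """AND reduction: true if every valid pixel has its predicate bit set.

    Invalid (killed) pixels are ignored rather than counted as failures.
    """
    return (pred & valid) == valid
```

For example, with only pixels 0 and 1 valid (`valid = 0b0011`), a predicate vector of `0b0100` gives `any_predicate(...) == False`, because the only set bit belongs to an invalid pixel.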

| Conditional_Execute_Predicates_No_Stall |  |  |  |  |  |  |  |
|---|---|---|---|---|---|---|---|
| 47 ... 44 | 43 | 42 | 41 ... 36 | 35 ... 34 | 33 ... 16 | 15 ... 12 | 11 ... 0 |
|  | Addressing | Condition | RESERVED | Predicate vector | Instructions type + serialize (9 instructions) | Count | Exec Address |

| Conditional_Execute_Predicates_No_Stall_End |  |  |  |  |  |  |  |
|---|---|---|---|---|---|---|---|
| 47 ... 44 | 43 | 42 | 41 ... 36 | 35 ... 34 | 33 ... 16 | 15 ... 12 | 11 ... 0 |
| 1110 | Addressing | Condition | RESERVED | Predicate vector | Instructions type + serialize (9 instructions) | Count | Exec Address |

Same as Conditional\_Execute\_Predicates but the SQ is not going to wait for the predicate vector to be updated. You can only set this in the compiler if you know that the predicate set is only a refinement of the current one (like a nested if) because the optimization would still work.

| Loop_Start |  |  |  |  |  |
|---|---|---|---|---|---|
| 47 ... 44 | 43 | 42 ... 21 | 20 ... 16 | 15 ... 12 | 11 ... 0 |
| 0111 | Addressing | RESERVED | loop ID | RESERVED | Jump address |

Loop Start. Compares the loop iterator with the end value. If loop condition not met jump to the address. Forward jump only. Also computes the index value. The loop id must match between the start to end, and also indicates which control flow constants should be used with the loop.

| Loop_End |  |  |  |  |  |  |
|---|---|---|---|---|---|---|
| 47 ... 44 | 43 | 42 ... 24 | 23 ... 21 | 20 ... 16 | 15 ... 12 | 11 ... 0 |
| 1000 | Addressing | RESERVED | Predicate break | loop ID | RESERVED | start address |

Loop end. Increments the counter by one, compares the loop count with the end value. If loop condition met, continue, else, jump BACK to the start of the loop. If predicate break != 0, then compares predicate vector n (specified by predicate break number). If all bits cleared then break the loop.

The way this is described does not prevent nested loops, and the inclusion of the loop id makes this easy to do.
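The Loop\_Start / Loop\_End pairing behaves like a counted loop with an entry test and an exit test. The generator below is a behavioural sketch only: its name is hypothetical, the index formula (start + iterator x step) is an assumption based on the Loop\_Start and Loop\_Step constants, and the predicate break is left out.

```python
def run_loop(loop_count: int, loop_start: int, loop_step: int):
    """Yield the index value seen by each iteration of the loop body."""
    iterator = 0
    # Loop_Start: jump forward past the loop when the end value is already reached.
    if iterator >= loop_count:
        return
    while True:
        yield loop_start + iterator * loop_step  # assumed index computation
        # Loop_End: increment the counter, then fall through or jump back.
        iterator += 1
        if iterator >= loop_count:
            break


print(list(run_loop(3, 10, 2)))  # [10, 12, 14]
```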

| Conditional_Call |  |  |  |  |  |  |
|---|---|---|---|---|---|---|
| 47 ... 44 | 43 | 42 | 41 ... 34 | 33 ... 13 | 12 | 11 ... 0 |
| 1001 | Addressing | Condition | Boolean address | RESERVED | Force Call | Jump address |

If the condition is met, jumps to the specified address and pushes the control flow program counter on the stack. If force call is set the condition is ignored and the call is made always.

| Return |  |  |
|---|---|---|
| 47 ... 44 | 43 | 42 ... 0 |
| 1010 | Addressing | RESERVED |

Pops the topmost address from the stack and jumps to that address. If nothing is on the stack, the program will just continue to the next instruction.
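A call/return pair with the 4-entry stack mentioned in section 6.1 can be sketched as follows. The class is hypothetical; pushing the address of the instruction after the call, and raising an error on a fifth nested call, are assumptions, since the text does not describe either detail.

```python
STACK_DEPTH = 4  # stack of 4 elements for nested subroutine calls


class CallStack:
    def __init__(self):
        self.stack = []

    def call(self, pc: int, target: int) -> int:
        """Take a call at control flow address `pc`; returns the next address."""
        if len(self.stack) == STACK_DEPTH:
            raise OverflowError("more than 4 nested calls")  # assumed behaviour
        self.stack.append(pc + 1)  # assumed: resume after the call instruction
        return target

    def ret(self, pc: int) -> int:
        """Return: pop and jump, or continue to the next instruction if empty."""
        return self.stack.pop() if self.stack else pc + 1
```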

| Conditional_Jump |  |  |  |  |  |  |  |
|---|---|---|---|---|---|---|---|
| 47 ... 44 | 43 | 42 | 41 ... 34 | 33 | 32 ... 13 | 12 | 11 ... 0 |
| 1011 | Addressing | Condition | Boolean address | FW only | RESERVED | Force Jump | Jump address |

If force jump is set the condition is ignored and the jump is made always. If FW only is set then only forward jumps are allowed.




|       | Allocate |               |          |      |  |  |  |  |  |  |
|-------|----------|---------------|----------|------|--|--|--|--|--|--|
| 47 44 | 43       | 4241          | 40 3     | 20   |  |  |  |  |  |  |
| 1100  | Debug    | Buffer Select | RESERVED | Size |  |  |  |  |  |  |

Buffer Select takes a value of the following:

- 01 position export (ordered export)
- 10 parameter cache or pixel export (ordered export)
- 11 pass thru (out of order exports).

Size field is only used to reserve space in the export buffer for pass thru exports. Valid values are 1 (1 line) thru 9 (9 lines). It should be determined by the compiler/assembler by taking max index used +1.

If debug is set this is a debug alloc (ignore if debug DB\_ON register is set to off).

### 6.3 Implementation

The envisioned implementation has a buffer that maintains the state of each thread. A thread lives in a given location in the buffer during its entire life, but the buffer has FIFO qualities in that threads leave in the order that they enter. Actually two buffers are maintained -- one for Vertices and one for Pixels. The intended implementation would allow for:

16 entries for vertices

48 entries for pixels.

From each buffer, arbitration logic attempts to select 1 thread for the texture unit and 1 (interleaved) thread for the ALU unit. Once a thread is selected it is read out of the buffer, marked as invalid, and submitted to the appropriate execution unit. It is returned to the buffer (at the same place) with its status updated once all possible sequential instructions have been executed. A switch from ALU to TEX or vice versa, or a Serialize\_Execution modifier, forces the thread to be returned to the buffer.

Each entry in the buffer will be stored across two physical pieces of memory - most bits will be stored in a 1 read port device. Only bits needed for thread arbitration will be stored in a highly multi-ported structure. The bits kept in the 1 read port device will be termed 'state'. The bits kept in the multi-read ported device will be termed 'status'.

'State Bits' needed include:

- 1. Control Flow Instruction Pointer (13 bits),
- 2. Execution Count Marker (4 bits),
- 3. Loop Iterators (4x9 bits),
- 4. Call return pointers (4x12 bits),
- 5. Predicate Bits (64 bits),
- 6. Export ID (1 bit),
- 7. Parameter Cache base Ptr (7 bits),
- 8. GPR Base Ptr (8 bits),
- 9. Context Ptr (3 bits).
- 10. LOD corrections (6x16 bits)
- 11. Valid bits (64 bits)
- 12. RT (1 bit) Signifies that this thread is a Real Time thread. This bit must be sent to the Constant store state machine when reading it.

Absent from this list are 'Index' pointers. These are costly enough that I'm presuming that they are instead stored in the GPRs. The first seven fields above (Control Flow Ptr, Execution Count, Loop Counts, call return ptrs, Predicate bits, PC base ptr and export ID) are updated every time the thread is returned to the buffer based on how much progress has been made on thread execution. GPR Base Ptr, Context Ptr and LOD corrections are unchanged throughout execution of the thread.




'Status Bits' needed include:

- Valid Thread
- Texture/ALU engine needed
- Texture Reads are outstanding
- Waiting on Texture Read to Complete
- Allocation Wait (2 bits)
  - 00 No allocation needed
  - 01 Position export allocation needed (ordered export)
  - 10 Parameter or pixel export needed (ordered export)
  - 11 pass thru (out of order export)
- Allocation Size (4 bits)
- Position Allocated
- Mem/Color Allocated
- First thread of a new context
- Event thread (NULL thread that needs to trickle down the pipe)
- Last (1 bit)
- Pulse SX (1 bit)

All of the above fields from all of the entries go into the arbitration circuitry. The arbitration circuitry will select a winner for both the Texture Engine and for the ALU engine. There are actually two sets of arbitration -- one for pixels and one for vertices. A final selection is then done between the two. But the rest of this implementation summary only considers the 'first' level selection which is similar for both pixels and vertices.

Texture arbitration requires no allocation or ordering so it is purely based on selecting the 'oldest' thread that requires the Texture Engine.

ALU arbitration is a little more complicated. First, only threads where either of Texture\_Reads\_outstanding or Waiting\_on\_Texture\_Read\_to\_Complete are '0' are considered. Then if Allocation\_Wait is active, these threads are further filtered based on whether space is available. If the allocation is position allocation, then the thread is only considered if all 'older' threads have already done their position allocation (position allocated bits set). If the allocation is parameter or pixel allocation, then the thread is only considered if it is the oldest thread. Also a thread is not considered if it is a parameter or pixel or position allocation, has its First\_thread\_of\_a\_new\_context bit set and would cause ALU interleaving with another thread performing the same parameter or pixel or position allocation. Finally the 'oldest' of the threads that pass through the above filters is selected. If the thread needed to allocate, then at this time the allocation is done, based on Allocation\_Size. If a thread has its "last" bit set, then it is also removed from the buffer, never to return.
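The ALU filtering steps described above can be sketched in software as follows. This is only an illustrative model under stated assumptions, not the hardware implementation: the `ThreadStatus` class, the `age` ordering key and the `free_space` argument are hypothetical, the texture test follows the literal wording (a thread is blocked only when both texture bits are set), and the First\_thread\_of\_a\_new\_context interleaving rule is left out.

```python
from dataclasses import dataclass

# Allocation_Wait encodings, taken from the status-bit list.
NO_ALLOC, POSITION, PARAM_OR_PIXEL, PASS_THRU = 0b00, 0b01, 0b10, 0b11


@dataclass
class ThreadStatus:
    age: int  # hypothetical ordering key: a larger value means an older thread
    tex_reads_outstanding: bool = False
    waiting_on_tex_read: bool = False
    allocation_wait: int = NO_ALLOC
    allocation_size: int = 0
    position_allocated: bool = False


def pick_alu_thread(threads, free_space):
    """Return the oldest thread that passes the ALU filters, or None."""
    candidates = []
    for t in threads:
        # Literal reading of the text: a thread stays eligible while
        # either texture bit is 0, so only "both set" blocks it.
        if t.tex_reads_outstanding and t.waiting_on_tex_read:
            continue
        if t.allocation_wait != NO_ALLOC:
            if t.allocation_size > free_space:
                continue  # not enough room for the requested allocation
            older = [o for o in threads if o.age > t.age]
            if t.allocation_wait == POSITION and not all(
                o.position_allocated for o in older
            ):
                continue  # an older thread has not allocated position yet
            if t.allocation_wait == PARAM_OR_PIXEL and older:
                continue  # parameter/pixel allocation is for the oldest only
        candidates.append(t)
    return max(candidates, key=lambda t: t.age, default=None)
```

In this model a pass thru allocation (encoding 11) is filtered on available space only, since the text places no ordering constraint on it.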

If I now redefine 'clauses' to mean 'how many times the thread is removed from the thread buffer for the purpose of execution by either the ALU or Texture engine', then the minimum number of clauses needed is 2: one to request the allocation for exports (execution automatically halts after an 'Alloc' instruction, but the instruction does not perform the actual allocation) and one for the actual ALU/export instructions. As the 'Alloc' instruction could be part of a texture clause (presumably the final instruction in such a clause), a thread could still execute in this minimal number of 2 clauses, even if it involved texture fetching.

The Texture\_Reads\_Outstanding bit must be updated by the sequencer, based on keeping track of how many Texture Clauses have been executed by a given thread that have not yet had their data returned. Any number above 0 results in this bit being set. We could consider forcing synchronization such that two texture clauses for a given thread may not be outstanding at any time (that would be my preference for simplicity reasons and because it would require only very little change in the texture pipe interface). This would allow the sequencer to set the bit on execution of the texture clause, and allow the texture unit to return a pointer to the thread buffer on completion that clears the bit.

### 6.4 Data dependent predicate instructions

Data dependent conditionals will be supported in the R400. The only way we plan to support those is by supporting four vector/scalar predicate operations of the form:


PRED\_SETE\_# - similar to SETE except that the result is 'exported' to the sequencer.

PRED\_SETNE\_# - similar to SETNE except that the result is 'exported' to the sequencer.

PRED\_SETGT\_# - similar to SETGT except that the result is 'exported' to the sequencer.

PRED\_SETGTE\_# - similar to SETGTE except that the result is 'exported' to the sequencer.

For the scalar operations only we will also support the two following instructions:

PRED\_SETE0\_# - SETE0

PRED\_SETE1\_# - SETE1

The export is a single bit (1 or 0) that is sent using the same data path as the MOVA instruction. The sequencer will maintain 4 sets of 64-bit predicate vectors (in fact 8 sets, because we interleave two programs, but only 4 will be exposed) and use them to control the write masking. This predicate is not maintained across clause boundaries. The # sign specifies which predicate set to use, 0 thru 3.

Then we have two conditional execute bits. The first bit is a conditional execute "on" bit and the second bit tells us if we execute on 1 or 0. For example, the instruction:

P0\_ADD\_# R0,R1,R2

is only going to write the result of the ADD into those GPRs whose predicate bit is 0. Alternatively, P1\_ADD\_# would only write the results to the GPRs whose predicate bit is set. The use of P0 or P1 without precharging the sequencer with a PRED instruction is undefined.
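As a rough software analogy of the predicated write described above, the sketch below applies a per-element predicate as a write mask. The function name and the list-based representation of GPR lanes are assumptions made for illustration; a real vector holds 64 elements rather than the 4 shown.

```python
def predicated_write(dest, result, predicate, execute_on):
    """Write result[i] into dest[i] only where predicate[i] == execute_on.

    execute_on is 0 for a P0-style instruction and 1 for a P1-style one.
    """
    return [r if p == execute_on else d
            for d, r, p in zip(dest, result, predicate)]


# Four lanes shown for brevity.
dest = [0, 0, 0, 0]
result = [5, 6, 7, 8]
predicate = [1, 0, 1, 0]
p0_written = predicated_write(dest, result, predicate, execute_on=0)
p1_written = predicated_write(dest, result, predicate, execute_on=1)
```

With these inputs the P0 form updates only lanes 1 and 3, and the P1 form updates only lanes 0 and 2.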

{Issue: do we have to have a NOP between PRED and the first instruction that uses a predicate?}

### 6.5 HW Detection of PV,PS

Because of the control program, the compiler cannot detect statically dependent instructions. In the case of non-masked writes and subsequent reads the sequencer will insert uses of PV,PS as needed. This will be done by comparing the read address and the write address of consecutive instructions. For masked writes, the sequencer will insert NOPs wherever there is a dependent read/write.

The sequencer will also have to insert NOPs between PRED\_SET and MOVA instructions and their uses.

### 6.6 Register file indexing

Because we can have loops in a fetch clause, we need to be able to index into the register file in order to retrieve the data created in a fetch clause loop and use it in an ALU clause. The instruction will include the base address for register indexing and will contain these controls:

| Bit 7 | Bit 6 | Addressing mode     |
|------|-------|---------------------|
| 0    | 0     | 'absolute register' |
| 0    | 1     | 'relative register' |
| 1    | 0     | 'previous vector'   |
| 1    | 1     | 'previous scalar'   |

In the case of an absolute register we just take the address as is. In the case of a relative register read we take the base address and we add to it the loop\_index and this becomes our new address that we give to the shader pipe.

The sequencer is going to keep a loop index computed as such:

Index = Loop\_iterator\*Loop\_step + Loop\_start.

We loop until loop\_iterator = loop\_count. Loop\_step is a signed value [-128...127]. The computed index value is a 10 bit counter that is also signed. Its real range is [-256,256]. The tenth bit is only there so that we can provide an out of range value to the "indexing logic" so that it knows when the provided index is out of range and thus can make the necessary arrangements.
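The index computation and the absolute/relative address selection can be sketched as below. This is a minimal model, assuming a 128-entry register file (the GPR numbering 0 to 127 used in the register file allocation section) and assuming that an out-of-range relative address falls back to index 0, as the debugging section specifies for indexing overflow. The function names are hypothetical.

```python
# Mode encodings from the Bit 7 / Bit 6 table.
ABSOLUTE, RELATIVE, PREVIOUS_VECTOR, PREVIOUS_SCALAR = 0b00, 0b01, 0b10, 0b11
NUM_GPRS = 128  # assumed register-file depth


def loop_index(loop_iterator, loop_step, loop_start):
    """Index = Loop_iterator * Loop_step + Loop_start."""
    return loop_iterator * loop_step + loop_start


def gpr_read_address(mode, base_address, index):
    """Return (address, in_range) for absolute or relative register mode."""
    if mode == ABSOLUTE:
        return base_address, True
    if mode == RELATIVE:
        address = base_address + index
        in_range = 0 <= address < NUM_GPRS
        # Assumption: overflow reads index 0 and is reported as an error.
        return (address if in_range else 0), in_range
    raise ValueError("PV/PS operands do not read the register file")
```

For example, with Loop\_start = 10, Loop\_step = -2 and Loop\_iterator = 3 the index is 4, so a relative read with base address 5 targets register 9.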




### 6.7 Debugging the Shaders

In order to be able to debug the pixel/vertex shaders efficiently, we provide 2 methods.

#### 6.7.1 Method 1: Debugging registers

Current plans are to expose 2 debugging, or error notification, registers:

- 1. address register where the first error occurred
- 2. count of the number of errors

The sequencer will detect the following groups of errors:

- count overflow
- constant indexing overflow
- register indexing overflow

Compiler recognizable errors:

- jump errors
  - relative jump address > size of the control flow program
- call stack
  - call with stack full
  - return with stack empty

A jump error will always cause the program to break. In this case, a break means that the clause halts execution, while further clauses are still allowed to execute.

With all the other errors, the program can continue to run, potentially to worst-case limits. The program will only break if the DB\_PROB\_BREAK register is set.

If an index falls outside the constant or register range, causing an overflow error, the hardware is specified to return the value at index 0. This could be exploited to generate error tokens, by reserving and initializing the 0th register (or constant) for errors.

{ISSUE : Interrupt to the driver or not?}

#### 6.7.2 Method 2: Exporting the values in the GPRs

1) The sequencer will have a debug active, count register and an address register for this mode.

Under the normal mode execution follows the normal course.

Under the debug mode it is assumed that the program is always exporting n debug vectors and that all other exports to the SX block (position, color, z, etc.) will be turned off (changed into NOPs) by the sequencer (even if they occur before the address stated by the ADDR debug register).

# 7. Pixel Kill Mask

A vector of 64 bits is kept by the sequencer per group of pixels/vertices. Its purpose is to optimize the texture fetch requests and allow the shader pipe to kill pixels using the following instructions:

MASK\_SETE MASK\_SETNE MASK\_SETGT MASK\_SETGTE

# 8. Multipass vertex shaders (HOS)

Multipass vertex shaders are able to export from the 6 last clauses but to memory ONLY.




# 9. Register file allocation

The register file allocation for vertices and pixels can either be static or dynamic. In both cases, the register file is managed using two round robins (one for pixels and one for vertices). In the dynamic case the boundary between pixels and vertices is allowed to move; in the static case it is fixed to 128-VERTEX\_REG\_SIZE for vertices and PIXEL\_REG\_SIZE for pixels.



Above is an example of how the algorithm works. Vertices come in from top to bottom; pixels come in from bottom to top. Vertices are in orange and pixels in green. The blue line is the tail of the vertices and the green line is the tail of the pixels. Thus anything between the two lines is shared. When pixels meet vertices the line turns white and the boundary is static until both vertices and pixels share the same "unallocated bubble". Then the boundary is allowed to move again. The numbering of the GPRs starts from the bottom of the picture at index 0 and goes up to the top at index 127.

# 10. Fetch Arbitration

The fetch arbitration logic chooses one of the n potentially pending fetch clauses to be executed. The choice is made by looking at the Vs and Ps reservation stations and picking the first one ready to execute. Once chosen, the clause state machine will send one 2x2 fetch per clock (or 4 fetches in one clock every 4 clocks) until all the fetch instructions of the clause are sent. This means that there cannot be any dependencies between two fetches of the same clause.

The arbitrator will not wait for the fetches to return prior to selecting another clause for execution. The fetch pipe will be able to handle up to X(?) in flight fetches and thus there can be a fair number of active clauses waiting for their fetch return data.

# 11. ALU Arbitration

ALU arbitration proceeds in almost the same way as fetch arbitration. The ALU arbitration logic chooses one of the n potentially pending ALU clauses to be executed. The choice is made by looking at the Vs and Ps reservation stations and picking the first one ready to execute. There are two ALU arbiters, one for the even clocks and one for the odd clocks. For example, here is the sequencing of two interleaved ALU clauses (E and O stand for Even and Odd sets of 4 clocks):

Einst0 Oinst0 Einst1 Oinst1 Einst2 Oinst2 Einst0 Oinst3 Einst1 Oinst4 Einst2 Oinst0...




Proceeding this way hides the latency of 8 clocks of the ALUs. Note that the interleaving also occurs across clause boundaries.
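The even/odd issue order shown in the example sequence can be reproduced with a short sketch. The function name and the clause-length lists are hypothetical; the clause lengths (3 instructions per even clause, 5 then 1 for the odd side) are simply read off the example sequence.

```python
def interleave(even_clause_lengths, odd_clause_lengths):
    """Alternate instructions from the even and odd arbiters.

    Each list gives the instruction count of consecutive clauses picked
    by that arbiter; instruction numbering restarts at every clause.
    """
    even = [f"Einst{i}" for n in even_clause_lengths for i in range(n)]
    odd = [f"Oinst{i}" for n in odd_clause_lengths for i in range(n)]
    issued = []
    for e, o in zip(even, odd):
        issued += [e, o]  # one even slot followed by one odd slot
    return issued


sequence = interleave([3, 3], [5, 1])
```

Because each stream simply continues with its next clause when the current one ends, the alternation is unaffected by clause boundaries.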

# 12. Handling Stalls

When the output file is full, the sequencer prevents the ALU arbitration logic from selecting the last clause (this way nothing can exit the shader pipe until there is room in the output file). If the packet is a vertex packet and the position buffer is full (POS\_FULL) then the sequencer also prevents a thread from entering an exporting clause. The sequencer will set the OUT\_FILE\_FULL signal n clocks before the output file is actually full, so the ALU arbiter will be able to read this signal and act accordingly by preventing exporting clauses from proceeding.

# 13. Content of the reservation station FIFOs

The reservation FIFOs contain the state of the vector of pixels and vertices. We have two sets of those: one for pixels, and one for vertices. They contain 3 bits of Render State, 7 bits for the base address of the GPRs, some bits for LOD correction, coverage mask information (in order to fetch only for valid pixels), and the quad address.

# 14. The Output File

The output file is where pixels are put before they go to the RBs. The write BW to this store is 256 bits/clock. Just before this output file are staging registers with write BW 512 bits/clock and read BW 256 bits/clock. The staging registers are 4x128 (and there are 16 of those on the whole chip).

# 15. IJ Format

The IJ information sent by the PA is of this format on a per quad basis:

We have a vector of IJs (one IJ per pixel at the centroid of the fragment or at the center of the pixel depending on the mode bit). All pixel parameters are always interpolated at full 20x24 mantissa precision.

$$P0 = A + I(0) * (B - A) + J(0) * (C - A)$$

$$P1 = A + I(1) * (B - A) + J(1) * (C - A)$$

$$P2 = A + I(2) * (B - A) + J(2) * (C - A)$$

$$P3 = A + I(3) * (B - A) + J(3) * (C - A)$$



Multiplies (Full Precision): 8

Subtracts 19x24 (Parameters): 2

Adds: 8

FORMAT OF P's IJ: N

Mantissa 20 Exp 4 for I + Sign

Mantissa 20 Exp 4 for J + Sign

Total number of bits: 20\*8 + 4\*8 + 4\*2 = 200.
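The four interpolation equations and the bit count can be checked with a small sketch. Plain Python floats stand in for the hardware number formats, and the function names are hypothetical; the point is only to show the operation counts (2 subtracts shared by the quad, then 8 multiplies and 8 adds) and the 200-bit total.

```python
def interpolate_quad(a, b, c, i, j):
    """Evaluate Pn = A + I(n)*(B - A) + J(n)*(C - A) for n = 0..3."""
    db, dc = b - a, c - a  # the 2 subtracts, shared by all 4 pixels
    # 2 multiplies and 2 adds per pixel: 8 multiplies and 8 adds per quad.
    return [a + i[n] * db + j[n] * dc for n in range(4)]


def ij_bits(mantissa=20, exponent=4, sign=1, values=8):
    """Bits per quad: 4 I values plus 4 J values, each sign + exp + mantissa."""
    return values * (mantissa + exponent + sign)


quad = interpolate_quad(1.0, 3.0, 5.0,
                        i=[0.0, 1.0, 0.0, 0.5],
                        j=[0.0, 0.0, 1.0, 0.25])
```

With I = J = 0 the result is A, with I = 1 it is B, and with J = 1 it is C, which is the expected behaviour of the barycentric form used here.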

All numbers are kept using the un-normalized floating point convention: if the exponent is different from 0 the number is normalized; if not, the number is un-normalized. The maximum range for the IJs (Full precision) is +/- 1024.

### 15.1 Interpolation of constant attributes

Because of the floating point imprecision, we need to take special provisions if all the interpolated terms are the same or if two of the terms are the same.




# 16. Staging Registers

In order for the reuse of the vertices to be 14, the sequencer will have to re-order the data sent IN ORDER by the VGT for it to be aligned with the parameter cache memory arrangement. Given the following group of vertices sent by the VGT:

 $0\ 1\ 2\ 3\ 4\ 5\ 6\ 7\ 8\ 9\ 10\ 11\ 12\ 13\ 14\ 15\ ||\ 16\ 17\ 18\ 19\ 20\ 21\ 22\ 23\ 24\ 25\ 26\ 27\ 28\ 29\ 30\ 31\ ||\ 32\ 33\ 34\ 35\ 36\ 37\ 38\ 39\ 40\ 41\ 42\ 43\ 44\ 45\ 46\ 47\ ||\ 48\ 49\ 50\ 51\ 52\ 53\ 54\ 55\ 56\ 57\ 58\ 59\ 60\ 61\ 62\ 63$ 

The sequencer will re-arrange them in this fashion:

 $0\ 1\ 2\ 3\ 16\ 17\ 18\ 19\ 32\ 33\ 34\ 35\ 48\ 49\ 50\ 51\ ||\ 4\ 5\ 6\ 7\ 20\ 21\ 22\ 23\ 36\ 37\ 38\ 39\ 52\ 53\ 54\ 55\ ||\ 8\ 9\ 10\ 11\ 24\ 25\ 26\ 27\ 40\ 41\ 42\ 43\ 56\ 57\ 58\ 59\ ||\ 12\ 13\ 14\ 15\ 28\ 29\ 30\ 31\ 44\ 45\ 46\ 47\ 60\ 61\ 62\ 63$ 

The || markers show the SP divisions. In the event a shader pipe is broken, the SQ is responsible for inserting padding to account for the missing pipe. For example, if SP1 is broken, vertices 4 5 6 7 20 21 22 23 36 37 38 39 52 53 54 55 will not be sent by the VGT to the SQ, and the SQ is responsible for "jumping" over these vertices so that no valid vertices are sent to an invalid SP.
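The re-arrangement above follows a regular pattern: each SP receives 4 consecutive vertices out of every group of 16. A minimal sketch of that permutation is given below; the function name and its parameters are hypothetical, and the handling of a broken pipe is not modelled.

```python
def reorder_for_sps(vertices, num_sps=4, per_sp=4):
    """Regroup VGT-ordered vertices so each SP's vertices are contiguous."""
    group = num_sps * per_sp  # 16 vertices arrive per group
    out = []
    for sp in range(num_sps):
        for start in range(0, len(vertices), group):
            lo = start + sp * per_sp
            out.extend(vertices[lo:lo + per_sp])
    return out


reordered = reorder_for_sps(list(range(64)))
```

The first 16 entries of `reordered` are the vertices destined for SP0, the next 16 those for SP1, and so on.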

The most straightforward, non-compressed interface method would be to convert, in the VGT, the data to 32-bit floating point prior to transmission to the VSISRs. In this scenario, the data would be transmitted to (and stored in) the VSISRs in full 32-bit floating point. This method requires three 24-bit fixed-to-float converters in the VGT. Unfortunately, it also requires an additional 3,072 bits of storage across the VSISRs. This interface is illustrated in Figure 11. The area of the fixed-to-float converters and the VSISRs for this method is roughly estimated as 0.759 sq mm using the R300 process. The gate count estimate is shown in Figure 10.

| Basis for 8-deep Latch Memory (from R300) |            |                       |                          |
|-------------------------------------------|------------|-----------------------|--------------------------|
| 8x24-bit                                  | 11631      | $\mu^2$               | 60.57813 $\mu^2$ per bit |
| Area of 96x8-deep Latch Memory            | 46524      | $\mu^2$               |                          |
| Area of 24-bit Fix-to-float Converter     | 4712       | $\mu^2$ per converter |                          |
| Method 1                                  | Block      | Quantity              | Area                     |
|                                           | F2F        | 3                     | 14136                    |
|                                           | 8x96 Latch | 16                    | 744384                   |
|                                           | Total      |                       | 758520 $\mu^2$           |

Figure 10: Area Estimate for VGT to Shader Interface
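The area figures in Figure 10 can be re-derived from the R300 latch basis with a few lines of arithmetic. The variable names below are illustrative only; all input numbers are taken from the figure.

```python
# Inputs from Figure 10 (areas in square microns).
latch_8x24_area = 11631            # measured 8-deep x 24-bit latch memory
f2f_area = 4712                    # one 24-bit fix-to-float converter

area_per_bit = latch_8x24_area / (8 * 24)      # 60.578125 per bit
latch_96x8_area = area_per_bit * 96 * 8        # one 96x8-deep latch memory

f2f_total = 3 * f2f_area                       # three converters in the VGT
latch_total = 16 * latch_96x8_area             # sixteen 8x96 latch memories
method1_area = f2f_total + latch_total         # total for Method 1
method1_area_sq_mm = method1_area / 1_000_000  # 1 sq mm = 1e6 square microns
```

The total comes out at 758520 square microns, which rounds to the 0.759 sq mm quoted in the text.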






Figure 11: VGT to Shader Interface

# 17. The parameter cache

The parameter cache is where the vertex shaders export their data. It consists of 16 128x128 memories (1R/1W). The reuse engine will make it so that all vertices of a given primitive will hit different memories. The allocation method for these memories is a simple round robin. The parameter cache pointers are mapped in the following way: the 4 MSBs are the memory number and the 7 LSBs are the address within this memory.

| MEMORY NUMBER | ADDRESS |
|---------------|---------|
| 4 bits        | 7 bits  |

The PA generates the parameter cache addresses as the positions come from the SQ. All it needs to do is keep a Current\_Location pointer (7 bits only) and, as the positions come, increment the memory number. When the memory number field wraps around, the PA increments the Current\_Location by VS\_EXPORT\_COUNT (a snooped register from the SQ). As an example, say the memories are all empty to begin with and the vertex shader is exporting 8 parameters per vertex (VS\_EXPORT\_COUNT = 8). The first position received is going to have the PC address 00000000000, the second one 00010000000, the third one 00100000000 and so on up to 11110000000. Then the next position received (the 17<sup>th</sup>) is going to have the address 00000001000, the 18<sup>th</sup> 00010001000, the 19<sup>th</sup> 00100001000 and so on. The Current\_Location is NEVER reset except on chip resets. The only thing to be careful about is that if the SX doesn't send you a full group of positions (<64) then you need to fill the address space so that the next group starts correctly aligned (for example if you receive only 33 positions then you need to add 2\*VS\_EXPORT\_COUNT to Current\_Location and reset the memory count to 0 before the next vector begins).
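The address generation described above can be modelled with a small sketch. The class and method names are hypothetical. Two points are assumptions rather than statements from the text: Current\_Location is taken to wrap modulo 128 (it is a 7-bit pointer), and the padding rule generalizes the 33-position example by adding one VS\_EXPORT\_COUNT for every row of 16 that has not yet wrapped.

```python
class PcAddressGenerator:
    MEMORIES = 16  # 4-bit memory number
    DEPTH = 128    # 7-bit address within a memory

    def __init__(self, vs_export_count):
        self.vs_export_count = vs_export_count
        self.current_location = 0
        self.memory = 0

    def next_address(self):
        """Return the 11-bit PC address for the next position received."""
        address = (self.memory << 7) | self.current_location
        self.memory = (self.memory + 1) % self.MEMORIES
        if self.memory == 0:  # memory number wrapped: move to the next row
            self.current_location = (
                self.current_location + self.vs_export_count
            ) % self.DEPTH
        return address

    def pad_group(self, positions_received, group_size=64):
        """Realign after a partial group so the next group starts cleanly."""
        rows = group_size // self.MEMORIES
        missing = rows - positions_received // self.MEMORIES
        self.current_location = (
            self.current_location + missing * self.vs_export_count
        ) % self.DEPTH
        self.memory = 0


gen = PcAddressGenerator(vs_export_count=8)
addresses = [format(gen.next_address(), "011b") for _ in range(33)]
gen.pad_group(33)
```

After the 33 positions and the padding step, `current_location` is 32, that is 4\*VS\_EXPORT\_COUNT, which is where a full group of 64 would also have ended.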




### 17.1 Export restrictions

#### 17.1.1 Pixel exports:

Pixels can export 1, 2, 3 or 4 color buffers to the SX (+Z). The exports will be done in order. The PRED\_OPTIMIZE function has to be turned off if the exports are done using interleaved predicated instructions. The exports will always be ordered to the SX.

#### 17.1.2 Vertex exports:

Position or parameter caches can be exported in any order in the shader program. It is always better to export position as soon as possible. Position has to be exported in a single export block (no texture instructions can be placed between the exports). Parameter cache exports can be done in any order with texture instructions interleaved. The PRED\_OPTIMIZE function has to be turned off if the exports are done using interleaved predicated instructions to the Parameter cache (see Arbitration restrictions for details). The exports will always be allocated in order to the SX.

#### 17.1.3 Pass thru exports:

Pass thru exports have to be done in groups of the form:

```
Alloc 4 (8 or 12)
Execute ALU(ADDR) ALU(DATA) ALU(DATA) ALU(DATA)...
```

They cannot have texture instructions interleaved in the export block. These exports are not guaranteed to be ordered.

Also, when doing a pass thru export, Position MUST be exported AFTER all pass thru exports. This position export is used to synchronize the chip when doing a transition from pass thru shader to regular shader and vice versa.

### 17.2 Arbitration restrictions

Here are the Sequencer arbitration restrictions:

- 1) Cannot execute a serialized thread if the corresponding texture pending bit is set
- 2) Cannot allocate position if any older thread has not allocated position
- 3) If last thread is marked as not valid AND marked as last and we are about to execute the second to oldest thread also marked last then:
  - a. Both threads must be from the same context (cannot allow a first thread)
  - b. Must turn off the predicate optimization for the second thread
- 4) Cannot execute a texture clause if texture reads are pending
- 5) Cannot execute last if texture pending (even if not serial)

# 18. Export Types

The export type (or the location where the data should be put) is specified using the destination address field in the ALU instruction. Here is a list of all possible export modes:

### 18.1 Vertex Shading

0:15 - 16 parameter cache

16:31 - Empty (Reserved?)

32 - Export Address

33:37 - 5 vertex exports to the frame buffer and index

38:47 - Empty

48:52 - 5 debug exports (interpret as normal vertex-memory export)

60 - export addressing mode

61 - Empty

62 - position

63 - sprite size export that goes with position export (X = point size, Y = edge flag is bit 0, Z = VtxKill is bitwise OR of bits 30:0. Any bit other than sign means VtxKill.)

### 18.2 Pixel Shading

```
0 - Color for buffer 0 (primary)
1 - Color for buffer 1
2 - Color for buffer 2
3 - Color for buffer 3
4:15 - Empty
16 - Buffer 0 Color/Fog (primary)
17 - Buffer 1 Color/Fog
18 - Buffer 2 Color/Fog
19 - Buffer 3 Color/Fog
20:31 - Empty
32 - Export Address
33:37 - 5 exports for multipass pixel shaders
38:47 - Empty
48:52 - 5 debug exports (interpret as normal pixel-memory export)
60 - export addressing mode
61 - Z for primary buffer (Z exported to 'alpha' component)
62:63 - Empty
```

# 19. Special Interpolation modes

### 19.1 Real time commands

We are unable to use the parameter memory since there is no way for a command stream to write into it. Instead we need to add three 16x128 memories (one for each of three vertices x 16 interpolants). These will be mapped onto the register bus, written by type 0 packets, and output to the parameter busses (the sequencer and/or PA need to be able to address the realtime parameter memory as well as the regular parameter store). For higher performance we should be able to view them as two banks of 16 and do double buffering, allowing one to be loaded while the other is rasterized with. Most overlay shaders will need 2 or 4 scalar coordinates, so one option might be to restrict the memory to 16x64 or 32x64, allowing only two interpolated scalars per cycle. The only problem I see with this is that, if we view support for 16 vector-4 interpolants as important (true only if we map Microsoft's high priority stream to the realtime stream), then the PA/sequencer need to support a realtime-specific mode where we need to address 32 vectors of parameters instead of 16. This mode is triggered by the primitive type: REAL TIME. The actual memories are in the SX blocks. The parameter data memories are hooked on the RBBM bus and are loaded by the CP using register mapped memory.

### 19.2 Sprites/ XY screen coordinates/ FB information

XY screen coordinates may be needed in the shader program. This functionality is controlled by the param\_gen\_I0 register (in SQ) in conjunction with the SND\_XY register (in SC) and the param\_gen\_pos register. It is also possible to send the faceness information (for OGL front/back special operations) to the shader using the same control register. Here is a list of all the modes and how they interact together:

The Data is going to be written in the register specified by the param\_gen\_pos register.

```
Param_Gen_I0 disable, snd_xy disable = No modification
Param_Gen_I0 disable, snd_xy enable = No modification
Param_Gen_I0 enable, snd_xy disable = Sign(faceness)garbage,(Sign Point)garbage,Sign(Line)s, t
Param_Gen_I0 enable, snd_xy enable = Sign(faceness)screenX,(Sign Point)screenY,Sign(Line)s, t
```

In other words

The generated vector is (X in RED, Y in GREEN, S in BLUE and T in ALPHA): X,Y,S,T




These values are always supposed to be positive and any shader use of them should use the ABS function (as their sign bits will now be used for flags).

SignX = BackFacing

SignY = Point Primitive

SignS = Line Primitive

SignT = currently unused as a flag.

If !Point & !Line, then it is a Poly.

I would assume that one implementation which allows for generic texture lookup (using 3D maps) for poly stipple and AA for the driver would be

if (Y < 0) { R = 0.0 (Point) } else if (S < 0) { R = 1.0 (Line) } else { R = 2.0 (Poly) }
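For illustration, the sign-bit flags can be decoded in software as sketched below. This is not shader code; the function name and the returned dictionary are hypothetical. The sketch tests the sign bit itself (through `math.copysign`) rather than comparing against zero, on the assumption that a flag may be attached to a value of exactly 0.0, in which case a plain `< 0` comparison would miss it.

```python
import math


def decode_param_gen(x, y, s, t):
    """Split a generated (X, Y, S, T) vector into flags and magnitudes."""
    def sign_set(value):
        return math.copysign(1.0, value) < 0  # true for -0.0 as well

    if sign_set(y):
        primitive = "point"
    elif sign_set(s):
        primitive = "line"
    else:
        primitive = "poly"
    return {
        "back_facing": sign_set(x),
        "primitive": primitive,
        # The values themselves are meant to be read through ABS.
        "x": abs(x), "y": abs(y), "s": abs(s), "t": abs(t),
    }
```

SignT is ignored here because it is currently unused as a flag.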

### 19.3 Auto generated counters

In the case of multipass shaders, the sequencer is going to generate a vector count, which is used both to write the 1st pass data to memory and to retrieve the data on the 2nd pass. The count is always generated in the same way but it is passed to the shader in a slightly different way depending on the shader type (pixel or vertex). This is toggled on and off using the GEN\_INDEX\_PIX/VTX register. The sequencer is going to keep two counters, one for pixels and one for vertices. Every time a full vector of vertices or pixels is written to the GPRs the counter is incremented. Every time a RST\_PIX\_COUNT or RST\_VTX\_COUNT event is received, the corresponding counter is reset. While there is only one count broadcast to the GPRs, the LSBs are hardwired to specific values, making the index different for all elements in the vector. Since the count must be different for all pixels/vertices and the 4 LSBs (16 positions) are hardwired to the corresponding shader unit, the SQ has two options:

1) Maintain a 19-bit counter that counts the vectors of 64. In this case the phase must be appended to the count before the count is broadcast to the SPs:

Counter (19 bits) Phase (2 bits) Hardwired (4 bits)

2) Maintain a 21-bit counter that counts sub-vectors of 16. In this case only the counter is sent to the SPs:

Counter (21 bits) Hardwired (4 bits)
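The two layouts produce the same 25-bit index, as the following sketch shows. The function names and the `lane` parameter (standing for the 4 hardwired LSBs) are hypothetical, and the field order is assumed to be most significant first, as the layouts are written.

```python
def index_from_vector_counter(counter19, phase, lane):
    """Option 1: Counter (19 bits) | Phase (2 bits) | Hardwired (4 bits)."""
    assert 0 <= counter19 < 1 << 19 and 0 <= phase < 4 and 0 <= lane < 16
    return (counter19 << 6) | (phase << 4) | lane


def index_from_subvector_counter(counter21, lane):
    """Option 2: Counter (21 bits) | Hardwired (4 bits)."""
    assert 0 <= counter21 < 1 << 21 and 0 <= lane < 16
    return (counter21 << 4) | lane


# A sub-vector counter equals 4 * vector counter + phase, so both agree.
vector_count, phase, lane = 5, 2, 9
option1 = index_from_vector_counter(vector_count, phase, lane)
option2 = index_from_subvector_counter(vector_count * 4 + phase, lane)
```

The choice between the two is therefore about where the phase is tracked, not about the value delivered to the shader.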

#### 19.3.1 Vertex shaders

In the case of vertex shaders, if GEN\_INDEX\_VTX is set, the data will be put into the x field of the third register (it means that the compiler must allocate 3 GPRs in all multipass vertex shader modes).

#### 19.3.2 Pixel shaders

In the case of pixel shaders, if GEN\_INDEX\_PIX is set, the data will be put in the x field of the param\_gen\_pos+1 register.




Figure 12: GPR input mux Control

# 20. State management

Every clock, the sequencer will report to the CP the oldest states still in the pipe. These are the states of the programs as they enter the last ALU clause.

### 20.1 Parameter cache synchronization

In order for the sequencer not to begin a group of pixels before the associated group of vertices has finished, the sequencer will keep a 6 bit count per state (for a total of 8 counters). These counters are initialized to 0 and every time a vertex shader exports its data TO THE PARAMETER CACHE, the corresponding counter is incremented. When the SC sends a new vector of pixels with the SC\_SQ\_new\_vector bit asserted, the sequencer will first check if the count is greater than 0 before accepting the transmission (it will in fact accept the transmission but then lower its ready to receive). Then the sequencer waits for the count to go to one and decrements it. The sequencer can then issue the group of pixels to the interpolators. Every time the state changes, the new state counter is initialized to 0.

# 21. XY Address imports

The SC will be able to send the XY addresses to the GPRs. It does so by interleaving the writes of the IJs (to the IJ buffer) with XY writes (to the XY buffer). Then, when writing the data to the GPRs, the sequencer is going to interpolate the IJ data or pass the XY data thru a fixed-to-float converter and expander and write the converted values to the GPRs. The XYs are currently SCREEN SPACE COORDINATES. The values in the XY buffers will wrap. See section 19.2 for details on how to control the interpolation in this mode.

### 21.1 Vertex indexes imports

In order to import vertex indexes, we have 16 8x96 staging registers. These are loaded one line at a time by the VGT block (96 bits). They are loaded in floating point format and can be transferred in 4 or 8 clocks to the GPRs.

# 22. Registers

Please see the auto-generated web pages for register definitions.




# 23. Interfaces

### 23.1 External Interfaces

Whenever an x is used, it means that the bus is broadcast to all units of the same name. For example, if a bus is named  $SQ \rightarrow SPx$  it means that SQ is going to broadcast the same information to all SP instances.

### 23.2 SC to SP Interfaces

#### 23.2.1 SC\_SP#

There is one of these interfaces at the front of each SP (buffer to stage pixel interpolators). This interface transmits the I,J data for pixel interpolation. For the entire system, two quads per clock are transferred to the 4 SPs, so each of these 4 interfaces transmits one half of a quad per clock. The interface below describes a half of a quad worth of data.

The actual data which is transferred per quad is:

Ref Pix I => S4.20 Floating Point I value \*4

Ref Pix J => S4.20 Floating Point J value \*4

This equates to a total of 200 bits, which are transferred over 2 clocks and therefore need an interface 100 bits wide.

Additionally, X,Y data (12-bit unsigned fixed) is conditionally sent across this data bus over the same wires in an additional clock. The X,Y data is sent on the lower 24 bits of the data bus with faceness in the msb. Transfers across these interfaces are synchronized with the SC\_SQ IJ Control Bus transfers.

The data transfer across each of these busses is controlled by an IJ\_BUF\_INUSE\_COUNT in the SC. Each time the SC has sent a pixel vector's worth of data to the SPs, it increments IJ\_BUF\_INUSE\_COUNT. Prior to sending the next pixel vector's data, it checks that the count is less than MAX\_BUFER\_MINUS\_2; if not, the SC stalls until the SQ returns a pipelined pulse to decrement the count when it has scheduled a buffer free. Note: We could/may optimize for the case of sending only IJ to use all the buffers to pre-load more. Currently it is planned for the SP to hold 2 double buffers of I,J data and two buffers of X,Y data, so if either X,Y or Centers and Centroids are on, then the SC can send two buffers.
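The in-use counter described above behaves like a simple credit scheme. The sketch below is a behavioural model only, not the hardware implementation: the class name, method names and the `limit` parameter (standing in for the threshold the text calls MAX\_BUFER\_MINUS\_2, whose value is not given here) are all hypothetical.

```python
class IJBufferInUseCounter:
    """Behavioural model of the SC-side IJ buffer in-use count.

    `limit` is a hypothetical stand-in for the threshold named in the
    text; its real value is not stated in this section.
    """

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.in_use = 0

    def can_send(self) -> bool:
        # The SC may send only while the count is strictly below the limit.
        return self.in_use < self.limit

    def try_send_vector(self) -> bool:
        """Send one pixel vector's worth of data, or report a stall."""
        if not self.can_send():
            return False          # SC stalls until the SQ frees a buffer
        self.in_use += 1          # incremented after each vector sent
        return True

    def free_pulse(self) -> None:
        """Model the pipelined pulse from the SQ (a buffer was freed)."""
        if self.in_use > 0:
            self.in_use -= 1
```

With `limit=2`, two vectors can be sent back to back, a third attempt stalls, and one `free_pulse()` call lets the SC send again.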

In at least the initial version, the SC shall send 16 quads per pixel vector even if the vector is not full. This will increment buffer write address pointers correctly all the time. (We may revisit this for the SX, SP and SQ and add an EndOfVector signal on all interfaces to quit early. We opted for the simple mode first, in the belief that only the end of packet and multiple new vector signals should cause a partial vector and that this would not be a significant performance hit.)

| Name                  | Bits | Description                                                                      |
|-----------------------|------|----------------------------------------------------------------------------------|
| SC_SP#_data           | 100  | IJ information sent over 2 clocks (or X,Y in 24 LSBs with faceness in upper bit) |
|                       |      | Type 0 or 1, first clock I, second clock J                                       |
|                       |      | Field: ULC, URC, LLC, LRC                                                        |
|                       |      | Bits: [63:39], [38:26], [25:13], [12:0]                                          |
|                       |      | Format: SE4M20, SE4M20, SE4M20                                                   |
|                       |      | Type 2                                                                           |
|                       |      | Field: Face, X, Y                                                                |
|                       |      | Bits: [24], [23:12], [11:0]                                                      |
|                       |      | Format: Bit, Unsigned, Unsigned                                                  |
| SC_SP#_valid          | 1    | Valid                                                                            |
| SC_SP#_last_quad_data | 1    | This bit will be set on the last transfer of data per quad.                      |
| SC_SP#_type           | 2    | 0 -> Indicates centroids                                                         |
|                       |      | 1 -> Indicates centers                                                           |
|                       |      | 2 -> Indicates X,Y Data and faceness on data bus                                 |
|                       |      | The SC shall look at state data to determine how many types to send for the interpolation process. |
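As an illustration of the Type 2 layout listed in the table (Face in bit [24], X in [23:12], Y in [11:0]), the following sketch packs and unpacks such a word. The function names are hypothetical and the code only models the bit positions, not any timing.

```python
X_BITS = 12  # X and Y are 12-bit unsigned fixed-point values
Y_BITS = 12
FIELD_MAX = (1 << 12) - 1


def pack_type2(face: int, x: int, y: int) -> int:
    """Pack faceness, X and Y into the low 25 bits of the data bus."""
    if face not in (0, 1):
        raise ValueError("face is a single bit")
    if not (0 <= x <= FIELD_MAX and 0 <= y <= FIELD_MAX):
        raise ValueError("X and Y are 12-bit unsigned values")
    return (face << (X_BITS + Y_BITS)) | (x << Y_BITS) | y


def unpack_type2(word: int) -> tuple[int, int, int]:
    """Return (face, x, y) extracted from bits [24], [23:12], [11:0]."""
    face = (word >> (X_BITS + Y_BITS)) & 0x1
    x = (word >> Y_BITS) & FIELD_MAX
    y = word & FIELD_MAX
    return face, x, y
```

For example, `pack_type2(1, 0xABC, 0x123)` yields `0x1ABC123`, and `unpack_type2` recovers the three fields from it.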

The # is included for clarity in the spec. It will be replaced with a prefix of u#\_ in the Verilog module statement for the SC; the SP block will have neither, because the instantiation will insert the prefix.

### 23.2.2 SC\_SQ

This is the control information sent to the sequencer in order to synchronize and control the interpolation and/or loading of data into the GPRs needed to execute a shader program on the sent pixels. This data will be sent over two clocks per transfer with 1 to 16 transfers. Therefore the bus (approx. 108 bits) could be folded in half to approx. 54 bits.

| Name        | Bits | Description |
|-------------|------|-------------|
| SC_SQ_data  | 46   | Control data sent to the SQ.<br>1 clk transfers:<br>Event: valid data consists of event_id and state_id. Instructs the SQ to post an event vector to send state_id and event_id through the request fifo and onto the reservation stations, making sure state_id and/or event_id gets back to the CP. Events only follow end of packets, so no pixel vectors will be in progress.<br>Empty Quad Mask: transfer control data consisting of pc_dealloc or new_vector. Receipt of this is to transfer pc_dealloc or new_vector without any valid quad data. New vector will always be posted to the request fifo, and pc_dealloc will be attached to any pixel vector outstanding, or posted in the request fifo if no valid quad is outstanding.<br>2 clk transfers:<br>Quad Data Valid: sending quad data with or without new_vector or pc_dealloc. New vector will be posted to the request fifo with or without a pixel vector, and pc_dealloc will be posted with a pixel vector unless none is in progress. In this case the pc_dealloc will be posted in the request queue. Filler quads will be transferred with the quad mask set but the corresponding pixel mask set to zero. |
| SC_SQ_valid | 1    | SC sending valid data, 2<sup>nd</sup> clk could be all zeroes |

SC\_SQ\_data - first clock and second clock transfers are shown in the table below.

| Name                           | BitField | Bits | Description                                                                              |
|--------------------------------|----------|------|------------------------------------------------------------------------------------------|
| 1<sup>st</sup> Clock Transfer  |          |      |                                                                                          |
| SC_SQ_event                    | 0        | 1    | This transfer is a 1 clock event vector. Force quad_mask = new_vector = pc_dealloc = 0   |
| SC_SQ_event_id                 | [5:1]    | 4    | This field identifies the event. 0 => denotes an End Of State Event, 1 => TBD            |
| SC_SQ_state_id                 | [8:6]    | 3    | State/constant pointer (6*3+3)                                                           |
| SC_SQ_pc_dealloc               | [11:9]   | 3    | […] token for the Parameter Cache                                                        |
| SC_SQ_new_vector               | 12       | 1    | […] wait for Vertex shader done count > 0 a[…] Pixel Vector the SQ will decrement the count. |
| SC_SQ_quad_mask                | [16:13]  | 4    | Quad Write mask left to right SP0 => SP3                                                 |
| SC_SQ_end_of_prim              | 17       | 1    | End Of the primitive                                                                     |
| SC_SQ_pix_mask                 | [33:18]  | 16   | Valid bits for all pixels SP0=>SP3 (UL,UR,LL,LR)                                         |
| SC_SQ_provok_vtx               |          | 2    | Provoking vertex for flat shading                                                        |
| SC_SQ_lod_correct_0            |          | 9    | LOD correction for quad 0 (SP0) (9 bits per quad)                                        |
| SC_SQ_lod_correct_1            | [5…]     | 9    | LOD correction for quad 1 (SP1) (9 bits per quad)                                        |
| 2<sup>nd</sup> Clock Transfer  |          |      |                                                                                          |
| SC_SQ_lod_correct_2            | [8:0]    | 9    | LOD correction for quad 2 (SP2) (9 bits per quad)                                        |
| SC_SQ_lod_correct_3            | [17:9]   | 9    | LOD correction for quad 3 (SP3) (9 bits per quad)                                        |
| SC_SQ_pc_ptr0                  | [28:18]  | 11   | Parameter Cache pointer for vertex 0                                                     |
| SC_SQ_pc_ptr1                  | [39:29]  | 11   | Parameter Cache pointer for vertex 1                                                     |
| SC_SQ_pc_ptr2                  | [50:40]  | 11   | Parameter Cache pointer for vertex 2                                                     |
| SC_SQ_prim_type                | [53:51]  | 3    | Stippled line an[…]x cords from alternate buffer.<br>000: Sprite (point)<br>001: Line<br>010: Tri_rect<br>100: Realtime Sprite (point)<br>101: Realtime Line<br>110: Realtime […] |
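The second-clock bit fields listed in the table can be extracted from a 54-bit word as sketched below. This is only an illustration of the bit positions given in the table; the dictionary and function names are hypothetical.

```python
# (msb, lsb) positions of the second-clock fields, as listed in the table.
SECOND_CLOCK_FIELDS = {
    "SC_SQ_lod_correct_2": (8, 0),
    "SC_SQ_lod_correct_3": (17, 9),
    "SC_SQ_pc_ptr0": (28, 18),
    "SC_SQ_pc_ptr1": (39, 29),
    "SC_SQ_pc_ptr2": (50, 40),
    "SC_SQ_prim_type": (53, 51),
}


def extract_field(word: int, msb: int, lsb: int) -> int:
    """Return bits [msb:lsb] of word as an unsigned integer."""
    width = msb - lsb + 1
    return (word >> lsb) & ((1 << width) - 1)


def decode_second_clock(word: int) -> dict[str, int]:
    """Split a second-clock SC_SQ_data word into its named fields."""
    return {
        name: extract_field(word, msb, lsb)
        for name, (msb, lsb) in SECOND_CLOCK_FIELDS.items()
    }
```

The field widths implied by these positions (9, 9, 11, 11, 11 and 3 bits) add up to 54 bits, which agrees with the folded bus width quoted for this interface.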

| Name               | Bits | Description                                                                                                                                                 |
|--------------------|------|-------------------------------------------------------------------------------------------------------------------------------------------------------------|
| SQ_SC_free_buff    | 1    | Pipelined bit that instructs SC to decrement count of buffers in use.                                                                                       |
| SQ_SC_dec_cntr_cnt | 1    | Pipelined bit that instructs SC to decrement count of new vector and/or event sent to prevent SC from overflowing SQ interpolator/Reservation request fifo. |

The scan converter will submit a partial vector whenever:

- 1.) It gets a primitive marked with an end of packet signal.
- 2.) A current pixel vector is being assembled with at least one valid quad and the vector has been marked for deallocate when a primitive marked new\_vector arrives. The Scan Converter will submit a partial vector (up to 16 quads with zero pixel mask to fill out the vector) prior to submitting the new\_vector marker/primitive.

(This prevents a hang that can be demonstrated when all primitives in a packet of three vectors are culled except for a one-quad primitive that gets marked pc\_dealloc (vertices maximum size). In this case two new\_vectors are submitted and processed, but then the one valid quad with the pc\_dealloc creates a vector, and the next new\_vector would wait for another vertex vector to be processed. The vertex vector being waited for could never export until the pc\_dealloc signal made it through, hence the hang.)
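The padding of a partial vector with filler quads can be sketched as follows. The representation is an assumption made for illustration: a vector is modelled as a list of per-quad 4-bit pixel masks, and a filler quad is an entry whose pixel mask is zero. The function name is hypothetical.

```python
QUADS_PER_VECTOR = 16  # from the text: 16 quads per pixel vector


def pad_partial_vector(pixel_masks: list[int]) -> list[int]:
    """Pad a list of per-quad pixel masks out to a full pixel vector.

    Filler quads carry a pixel mask of zero, so they occupy a slot
    (and advance the buffer write pointers) without enabling any pixel.
    """
    if len(pixel_masks) > QUADS_PER_VECTOR:
        raise ValueError("a pixel vector holds at most 16 quads")
    if any(not 0 <= mask <= 0xF for mask in pixel_masks):
        raise ValueError("each quad has a 4-bit pixel mask")
    filler = [0] * (QUADS_PER_VECTOR - len(pixel_masks))
    return list(pixel_masks) + filler
```

A vector holding only two valid quads, for instance `[0xF, 0x3]`, is therefore submitted as 16 entries, the last 14 of which enable no pixels.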




### 23.2.3 SQ to SX(SP): Interpolator bus

| Name                       | Direction | Bits | Description                                             |
|----------------------------|-----------|------|---------------------------------------------------------|
| SQ_SPx_interp_flat_vtx     | SQ→SPx    | 2    | Provoking vertex for flat shading                       |
| SQ_SPx_interp_flat_gouraud | SQ→SPx    | 1    | Flat or gouraud shading                                 |
| SQ_SPx_interp_cyl_wrap     | SQ→SPx    | 4    | Which channel needs to be cylindrically wrapped         |
| SQ_SPx_interp_param_gen    | SQ→SPx    | 1    | Generate Parameter                                      |
| SQ_SPx_interp_prim_type    | SQ→SPx    | 2    | Bits [1:0] of primitive type sent by SC                 |
| SQ_SPx_interp_buff_swap    | SQ→SPx    | 1    | Swap IJ buffers                                         |
| SQ_SPx_interp_IJ_line      | SQ→SPx    | 2    | IJ line number                                          |
| SQ_SPx_interp_mode         | SQ→SPx    | 1    | Center/Centroid sampling                                |
| SQ_SXx_pc_ptr0             | SQ→SXx    | 11   | Parameter Cache Pointer                                 |
| SQ_SXx_pc_ptr1             | SQ→SXx    | 11   | Parameter Cache Pointer                                 |
| SQ_SXx_pc_ptr2             | SQ→SXx    | 11   | Parameter Cache Pointer                                 |
| SQ_SXx_rt_sel              | SQ→SXx    | 1    | Selects between RT and Normal data (Bit 2 of prim type) |
| SQ_SX0_pc_wr_en            | SQ→SX0    | 8    | Write enable for the PC memories                        |
| SQ_SX1_pc_wr_en            | SQ→SX1    | 8    | Write enable for the PC memories                        |
| SQ_SXx_pc_wr_addr          | SQ→SXx    | 7    | Write address for the PCs                               |
| SQ_SXx_pc_channel_mask     | SQ→SXx    | 4    | Channel mask                                            |
| SQ_SXx_pc_ptr_valid        | SQ→SXx    | 1    | Read pointers are valid.                                |
| SQ_SPx_interp_valid        | SQ→SPx    | 1    | Interpolation control valid                             |

### 23.2.4 SQ to SP: Staging Register Data

This is a broadcast bus that sends the VSISR information to the staging registers of the shader pipes.

| Name               | Direction | Bits | Description                                            |
|--------------------|-----------|------|--------------------------------------------------------|
| SQ_SPx_vsr_data    | SQ→SPx    | 96   | Pointers of indexes or HOS surface information         |
| SQ_SPx_vsr_double  | SQ→SPx    | 1    | 0: Normal 96 bits per vert 1: double 192 bits per vert |
| SQ_SP0_vsr_valid   | SQ→SP0    | 1    | Data is valid                                          |
| SQ_SP1_vsr_valid   | SQ→SP1    | 1    | Data is valid                                          |
| SQ_SP2_vsr_valid   | SQ→SP2    | 1    | Data is valid                                          |
| SQ_SP3_vsr_valid   | SQ→SP3    | 1    | Data is valid                                          |
| SQ_SPx_vsr_read    | SQ→SPx    | 1    | Increment the read pointers                            |

### 23.2.5 VGT to SQ: Vertex interface

### 23.2.5.1 Interface Signal Table

The area difference between the two methods is not sufficient to warrant complicating the interface or the state requirements of the VSISRs. Therefore, the POR for this interface is that the VGT will transmit the data to the VSISRs (via the Shader Sequencer) in full, 32-bit floating-point format. The VGT can transmit up to six 32-bit floating-point values to each VSISR, where four or more values require two transmission clocks. The data bus is 96 bits wide. In the case where an event is sent, the 5 LSBs of VGT\_SQ\_vsisr\_data contain the eventID.

| Name                   | Bits | Description                                                                                                                           |
|------------------------|------|---------------------------------------------------------------------------------------------------------------------------------------|
| VGT_SQ_vsisr_data      | 96   | Pointers of indexes or HOS surface information                                                                                        |
| VGT_SQ_event           | 1    | VGT is sending an event                                                                                                               |
| VGT_SQ_vsisr_continued | 1    | 0: Normal 96 bits per vert 1: double 192 bits per vert                                                                                |
| VGT_SQ_end_of_vtx_vect | 1    | Indicates the last VSISR data set for the current process vector (for double vector data, "end of vector" is set on the first vector) |
| VGT_SQ_indx_valid      | 1    | Vsisr data is valid                                                                                                                   |
| VGT_SQ_state           | 3    | Render State (6*3+3 for constants). This signal is guaranteed to be correct when "VGT_SQ_vgt_end_of_vector" is high.                  |
| VGT_SQ_send            | 1    | Data on the VGT_SQ is valid (see write-up for standard R400 SEND/RTR interface handshaking)                                           |
| SQ_VGT_rtr             | 1    | Ready to receive (see write-up for standard R400 SEND/RTR interface handshaking)                                                      |
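The clock count for a VSISR transfer follows directly from the 96-bit bus width and the 32-bit value size stated above. A minimal sketch, with a hypothetical function name:

```python
BUS_BITS = 96      # width of VGT_SQ_vsisr_data
VALUE_BITS = 32    # each value is a full 32-bit float
MAX_VALUES = 6     # at most six values per VSISR


def vsisr_transfer_clocks(num_values: int) -> int:
    """Number of transmission clocks needed for num_values 32-bit values."""
    if not 1 <= num_values <= MAX_VALUES:
        raise ValueError("the VGT sends between 1 and 6 values per VSISR")
    values_per_clock = BUS_BITS // VALUE_BITS        # 3 values fit per clock
    return -(-num_values // values_per_clock)         # ceiling division
```

One to three values fit in a single clock; four to six values need the second clock, which matches the normal (96 bits) and double (192 bits) cases of VGT_SQ_vsisr_continued.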

#### 23.2.5.2 Interface Diagrams






Figure 1. Detailed Logical Diagram for PA\_SQ\_vgt Interface.




### 23.2.6 SQ to SX: Control bus

| Name               | Direction | Bits | Description                                                                                                                                                                                                                                                        |
|--------------------|-----------|------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| SQ_SXx_exp_type    | SQ→SXx    | 2    | 00: Pixel without z (1 to 4 buffers) 01: Pixel with z (1 to 4 buffers) 10: Position (1 or 2 results) 11: Pass thru (4,8 or 12 results aligned)                                                                                                                     |
| SQ_SXx_exp_number  | SQ→SXx    | 2    | Number of locations needed in the export buffer (encoding depends on the type see bellow).                                                                                                                                                                         |
| SQ_SXx_exp_alu_id  | SQ→SXx    | 1    | ALU ID                                                                                                                                                                                                                                                             |
| SQ_SXx_exp_valid   | SQ→SXx    | 1    | Valid bit                                                                                                                                                                                                                                                          |
| SQ_SXx_exp_state   | SQ→SXx    | 3    | State Context                                                                                                                                                                                                                                                      |
| SQ_SXx_free_done   | SQ→SXx    | 1    | Pulse that indicates that the previous export is finished from the point of view of the SP. This does not necessarily mean that the data has been transferred to RB or PA, or that the space in export buffer for that particular vector thread has been freed up. |
| SQ_SXx_free_alu_id | SQ→SXx    | 1    | ALU ID                                                                                                                                                                                                                                                             |

Depending on the type, the number of export locations changes:

- Type 00: Pixels without Z
  - 00 = 1 buffer
  - 01 = 2 buffers
  - 10 = 3 buffers
  - 11 = 4 buffers
- Type 01: Pixels with Z
  - 00 = 2 buffers (color + Z)
  - 01 = 3 buffers (2 color + Z)
  - 10 = 4 buffers (3 color + Z)
  - 11 = 5 buffers (4 color + Z)
- Type 10: Position export
  - 00 = 1 position
  - 01 = 2 positions
  - 1X = Undefined
- Type 11: Pass Thru
  - 00 = 4 buffers
  - 01 = 8 buffers
  - 10 = 12 buffers
  - 11 = Undefined
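The encoding listed above can be captured in a small lookup table. The sketch below is illustrative only; the names are hypothetical, and undefined encodings are reported as `None` rather than modelled in any hardware-specific way.

```python
# exp_type -> {exp_number -> number of export locations}, per the list above.
EXPORT_LOCATIONS = {
    0b00: {0b00: 1, 0b01: 2, 0b10: 3, 0b11: 4},   # pixels without Z
    0b01: {0b00: 2, 0b01: 3, 0b10: 4, 0b11: 5},   # pixels with Z (colors + Z)
    0b10: {0b00: 1, 0b01: 2},                     # position export, 1X undefined
    0b11: {0b00: 4, 0b01: 8, 0b10: 12},           # pass thru, 11 undefined
}


def export_locations(exp_type: int, exp_number: int):
    """Decode SQ_SXx_exp_type / SQ_SXx_exp_number into a location count.

    Returns None for the encodings the list marks as undefined.
    """
    if not (0 <= exp_type <= 3 and 0 <= exp_number <= 3):
        raise ValueError("both fields are 2 bits wide")
    return EXPORT_LOCATIONS[exp_type].get(exp_number)
```

For example, `export_locations(0b01, 0b11)` gives 5 (four colors plus Z), while `export_locations(0b10, 0b10)` gives `None`.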

Below the thick black line is the end of transfer packet that tells the SX that a given export is finished. The report packet will always arrive either before or at the same time as the next export to the same ALU id.

### 23.2.7 SX to SQ: Output file control

| Name                 | Direction | Bits | Description |
|----------------------|-----------|------|-------------|
| SXx_SQ_exp_count_rdy | SXx→SQ    | 1    | Raised by SX0 to indicate that the following two fields reflect the result of the most recent export |
| SXx_SQ_exp_pos_avail | SXx→SQ    | 2    | Specifies whether there is room for another position.<br>00: 0 buffers ready<br>01: 1 buffer ready<br>10: 2 or more buffers ready |
| SXx_SQ_exp_buf_avail | SXx→SQ    | 7    | Specifies the space available in the output buffers.<br>0: buffers are full<br>1: 2K-bits available (32-bits for each of the 64 pixels in a clause)<br>…<br>64: 128K-bits available (16 128-bit entries for each of 64 pixels)<br>[…]7: RESERVED |
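The two endpoints given for SXx_SQ_exp_buf_avail (1 means 2K bits, 64 means 128K bits) are consistent with a linear scale of 2K bits per count. The sketch below assumes that linear scale for the intermediate codes, which the table does not spell out, and treats codes above 64 as reserved; the function name is hypothetical.

```python
BITS_PER_COUNT = 32 * 64   # 32 bits for each of 64 pixels = 2K bits
MAX_CODE = 64              # 64 -> 128K bits; larger codes treated as reserved


def exp_buf_avail_bits(code: int) -> int:
    """Convert an SXx_SQ_exp_buf_avail code to available bits.

    Assumption: codes between 1 and 64 scale linearly at 2K bits each.
    """
    if not 0 <= code <= 127:
        raise ValueError("the field is 7 bits wide")
    if code > MAX_CODE:
        raise ValueError("reserved encoding")
    return code * BITS_PER_COUNT
```

Under this assumption code 64 gives 131072 bits, which equals 16 entries of 128 bits for each of 64 pixels.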

### 23.2.8 SQ to TP: Control bus

Once every clock, the fetch unit tells the sequencer which RS line it is now working on and whether the data in the GPRs is ready. This way the sequencer can update the fetch valid bit flags for the reservation station. The sequencer also provides the instruction and constants for the fetch to execute, and the address in the register file where the fetch return data is to be written.

| Name                   | Direction | Bits | Description                                               |
|------------------------|-----------|------|-----------------------------------------------------------|
| TPx_SQ_data_rdy        | TPx→ SQ   | 1    | Data ready                                                |
| TPx_SQ_rs_line_num     | TPx→ SQ   | 6    | Line number in the Reservation station                    |
| TPx_SQ_type            | TPx→ SQ   | 1    | Type of data sent (0:PIXEL, 1:VERTEX)                     |
| SQ_TPx_send            | SQ→TPx    | 1    | Sending valid data                                        |
| SQ_TPx_const           | SQ→TPx    | 48   | Fetch state sent over 4 clocks (192 bits total)           |
| SQ_TPx_instr           | SQ→TPx    | 24   | Fetch instruction sent over 4 clocks                      |
| SQ_TPx_end_of_group    | SQ→TPx    | 1    | Last instruction of the group                             |
| SQ_TPx_Type            | SQ→TPx    | 1    | Type of data sent (0:PIXEL, 1:VERTEX)                     |
| SQ_TPx_gpr_phase       | SQ→TPx    | 2    | Write phase signal                                        |
| SQ_TP0_lod_correct     | SQ→TP0    | 6    | LOD correct 3 bits per comp 2 components per quad         |
| SQ_TP0_pix_mask        | SQ→TP0    | 4    | Pixel mask 1 bit per pixel                                |
| SQ_TP1_lod_correct     | SQ→TP1    | 6    | LOD correct 3 bits per comp 2 components per quad         |
| SQ_TP1_pix_mask        | SQ→TP1    | 4    | Pixel mask 1 bit per pixel                                |
| SQ_TP2_lod_correct     | SQ→TP2    | 6    | LOD correct 3 bits per comp 2 components per quad         |
| SQ_TP2_pix_mask        | SQ→TP2    | 4    | Pixel mask 1 bit per pixel                                |
| SQ_TP3_lod_correct     | SQ→TP3    | 6    | LOD correct 3 bits per comp 2 components per quad         |
| SQ_TP3_pix_mask        | SQ→TP3    | 4    | Pixel mask 1 bit per pixel                                |
| SQ_TPx_rs_line_num     | SQ→TPx    | 6    | Line number in the Reservation station                    |
| SQ_TPx_write_gpr_index | SQ→TPx    | 7    | Index into Register file for write of returned Fetch Data |
| SQ_TPx_ctx_id          | SQ→TPx    | 3    | The state context ID (needed for multisample resolves)    |
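SQ_TPx_const and SQ_TPx_instr are both sent over 4 clocks (48 bits and 24 bits per clock respectively). The sketch below shows one way to slice a wide value into per-clock beats and reassemble it. The beat order (least significant beat first) is an assumption, since the table does not state it, and the function names are hypothetical.

```python
def split_into_beats(value: int, beat_bits: int, beats: int = 4) -> list[int]:
    """Slice value into `beats` chunks of beat_bits each, LSB chunk first."""
    if value < 0 or value >> (beat_bits * beats):
        raise ValueError("value does not fit in the available beats")
    mask = (1 << beat_bits) - 1
    return [(value >> (i * beat_bits)) & mask for i in range(beats)]


def join_beats(chunks: list[int], beat_bits: int) -> int:
    """Inverse of split_into_beats (same assumed beat order)."""
    value = 0
    for i, chunk in enumerate(chunks):
        value |= chunk << (i * beat_bits)
    return value


CONST_BEAT_BITS = 48   # SQ_TPx_const: 4 x 48 = 192 bits of fetch state
INSTR_BEAT_BITS = 24   # SQ_TPx_instr: 4 x 24 = 96 bits of fetch instruction
```

Four beats of 48 bits carry the 192-bit fetch state quoted in the table; at 24 bits per clock, the instruction bus moves 96 bits in the same four clocks.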

### 23.2.9 TP to SQ: Texture stall

The TP sends this signal to the SQ and the SPs when its input buffer is full.






| Name              | Direction | Bits | Description                                  |
|-------------------|-----------|------|----------------------------------------------|
| TP_SQ_fetch_stall | TP→SQ     | 1    | Do not send more texture requests if asserted |

### 23.2.10 SQ to SP: Texture stall

| Name               | Direction | Bits | Description                                  |
|--------------------|-----------|------|----------------------------------------------|
| SQ_SPx_fetch_stall | SQ→SPx    | 1    | Do not send more texture requests if asserted |

### 23.2.11 SQ to SP: GPR and auto counter

| Name                 | Direction | Bits | Description                                                                                                                      |
|----------------------|-----------|------|----------------------------------------------------------------------------------------------------------------------------------|
| SQ_SPx_gpr_wr_addr   | SQ→SPx    | 7    | Write address                                                                                                                    |
| SQ_SPx_gpr_rd_addr   | SQ→SPx    | 7    | Read address                                                                                                                     |
| SQ_SPx_gpr_rd_en     | SQ→SPx    | 1    | Read Enable                                                                                                                      |
| SQ_SP0_gpr_wr_en     | SQ→SP0    | 4    | Write Enable for the GPRs of SP0                                                                                                 |
| SQ_SP1_gpr_wr_en     | SQ→SP1    | 4    | Write Enable for the GPRs of SP1                                                                                                 |
| SQ_SP2_gpr_wr_en     | SQ→SP2    | 4    | Write Enable for the GPRs of SP2                                                                                                 |
| SQ_SP3_gpr_wr_en     | SQ→SP3    | 4    | Write Enable for the GPRs of SP3                                                                                                 |
| SQ_SPx_gpr_phase     | SQ→SPx    | 2    | The phase mux (arbitrates between inputs, ALU SRC reads and writes)                                                              |
| SQ_SPx_channel_mask  | SQ→SPx    | 4    | The channel mask                                                                                                                 |
| SQ_SPx_gpr_input_sel | SQ→SPx    | 2    | When the phase mux selects the inputs, this selects the source to read from: Interpolated data, VTX0, VTX1, autogen counter.     |
| SQ_SPx_auto_count    | SQ→SPx    | 21   | Auto count generated by the SQ, common for all shader pipes                                                                      |




### 23.2.12 SQ to SPx: Instructions

| Name                 | Direction | Bits | Description                                                |
|----------------------|-----------|------|------------------------------------------------------------|
| SQ_SPx_instr_start   | SQ→SPx    | 1    | Instruction start                                          |
| SQ_SP_instr          | SQ→SPx    | 24   | Transferred over 4 cycles                                  |
|                      |           |      | Cycle 0: SRC A Negate Argument Modifier 0:0                |
|                      |           |      | SRC A Abs Argument Modifier 1:1                            |
|                      |           |      | SRC A Swizzle 9:2                                          |
|                      |           |      |                                                            |
|                      |           |      | Per channel Select 23:16                                   |
|                      |           |      | 00: GPR                                                    |
|                      |           |      | 01: PV                                                     |
|                      |           |      | 10: PS                                                     |
|                      |           |      | 11: Constant (if 11, it has to be 11 for all channels)     |
|                      |           |      | Cycle 1: SRC B Negate Argument Modifier 0:0                |
|                      |           |      | SRC B Abs Argument Modifier 1:1                            |
|                      |           |      | SRC B Swizzle 9:2                                          |
|                      |           |      | Scalar Dst 15:10                                           |
|                      |           |      | Per channel Select 23:16                                   |
|                      |           |      | 00: GPR                                                    |
|                      |           |      | 01: PV                                                     |
|                      |           |      | 10: PS                                                     |
|                      |           |      | 11: Constant (if 11, it has to be 11 for all channels)     |
|                      |           |      | Cycle 2: SRC C Negate Argument Modifier 0:0                |
|                      |           |      | SRC C Abs Argument Modifier 1:1                            |
|                      |           |      | SRC C Swizzle 9:2                                          |
|                      |           |      | Unused 15:10                                               |
|                      |           |      | Per channel Select 23:16                                   |
|                      |           |      | 00: GPR                                                    |
|                      |           |      | 01: PV                                                     |
|                      |           |      | 10: PS                                                     |
|                      |           |      | 11: Constant (if 11, it has to be 11 for all channels)     |
|                      |           |      | Cycle 3: Vector Opcode 4:0                                 |
|                      |           |      | Scalar Opcode 10:5                                         |
|                      |           |      | Vector Clamp 11:11                                         |
|                      |           |      | Scalar Clamp 12:12                                         |
|                      |           |      | Vector Write Mask 16:13                                    |
|                      |           |      | Scalar Write Mask 20:17                                    |
| SQ_SP0_pred_override | SQ→SP0    | 4    | 0: Use per channel RGBA field (enables the per channel logic; if not set, only pay attention to the 1 setting).<br>1: Use GPR |
| SQ_SP1_pred_override | SQ→SP1    | 4    | 0: Use per channel RGBA field (enables the per channel logic; if not set, only pay attention to the 1 setting).<br>1: Use GPR |
| SQ_SP2_pred_override | SQ→SP2    | 4    | 0: Use per channel RGBA field (enables the per channel logic; if not set, only pay attention to the 1 setting).<br>1: Use GPR |
| SQ_SP3_pred_override | SQ→SP3    | 4    | 0: Use per channel RGBA field (enables the per channel logic; if not set, only pay attention to the 1 setting).<br>1: Use GPR |
| SQ_SPx_exp_id        | SQ→SPx    | 1    | GPR ID                                                     |
| SQ_SPx_exporting     | SQ→SPx    |      | 0: Not Exporting<br>1: Exporting                           |
| SQ_SPx_stall         | SQ→SPx    | 1    | Stall signal                                               |
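The bit ranges in the SQ_SP_instr rows can be exercised with a small decoder. This is a minimal sketch and not part of the specification: the helper names (`bits`, `decode_source_cycle`, `decode_opcode_cycle`, `constant_select_is_legal`) are hypothetical, and the mapping of channel i to select bits 2i+1:2i is an assumption, since the table gives only the 23:16 range.

```python
def bits(word: int, hi: int, lo: int) -> int:
    """Return word[hi:lo] inclusive, matching the hi:lo ranges in the table."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)

SELECT_NAMES = {0b00: "GPR", 0b01: "PV", 0b10: "PS", 0b11: "Constant"}

def decode_source_cycle(word: int) -> dict:
    """Decode one 24-bit source word (cycle 0, 1 or 2: SRC A, B or C)."""
    select = bits(word, 23, 16)
    # Assumption: channel i uses select bits [2*i+1 : 2*i].
    per_channel = [SELECT_NAMES[(select >> (2 * ch)) & 0b11] for ch in range(4)]
    return {
        "negate": bits(word, 0, 0),
        "abs": bits(word, 1, 1),
        "swizzle": bits(word, 9, 2),
        "bits_15_10": bits(word, 15, 10),  # Scalar Dst in cycle 1, unused in cycle 2
        "select": per_channel,
    }

def decode_opcode_cycle(word: int) -> dict:
    """Decode the 24-bit word sent in cycle 3."""
    return {
        "vector_opcode": bits(word, 4, 0),
        "scalar_opcode": bits(word, 10, 5),
        "vector_clamp": bits(word, 11, 11),
        "scalar_clamp": bits(word, 12, 12),
        "vector_write_mask": bits(word, 16, 13),
        "scalar_write_mask": bits(word, 20, 17),
    }

def constant_select_is_legal(select_names) -> bool:
    """If any channel selects Constant, all channels must select Constant."""
    return "Constant" not in select_names or all(s == "Constant" for s in select_names)

# Example source word: negate=1, swizzle=0xE4, bits 15:10 = 5, all channels Constant.
src_word = 1 | (0xE4 << 2) | (5 << 10) | (0xFF << 16)
src = decode_source_cycle(src_word)

# Example opcode word: vector opcode 3, scalar opcode 7, vector clamp, masks 0xF and 0x1.
op_word = 3 | (7 << 5) | (1 << 11) | (0xF << 13) | (0x1 << 17)
op = decode_opcode_cycle(op_word)
```

The legality check mirrors the note on the 11 encoding: a source word that mixes Constant with GPR, PV or PS selects would be rejected.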

# 23.2.13 SQ to SX: write mask interface (must be aligned with the SP data)

| Name               | Direction | Bits | Description                                                                                                                                                                                                  |
|--------------------|-----------|------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| SQ_SX0_write_mask  | SQ→SX0    | 8    | Result of pixel kill in the shader pipe, which must be output for all pixel exports (depth and all color buffers). 4x4 because 16 pixels are computed per clock. This is for the data coming from SP0 and SP2. |
| SQ_SX1_write_mask  | SQ→SX1    | 8    | Result of pixel kill in the shader pipe, which must be output for all pixel exports (depth and all color buffers). 4x4 because 16 pixels are computed per clock. This is for the data coming from SP1 and SP3. |

### 23.2.14 SP to SQ: Constant address load/ Predicate Set/Kill set

| Name              | Direction | Bits | Description                                                                                                     |
|-------------------|-----------|------|-----------------------------------------------------------------------------------------------------------------|
| SP0_SQ_const_addr | SP0→SQ    | 36   | Constant address load / predicate vector load (4 bits only)/<br>Kill vector load (4 bits only) to the sequencer |
| SP0_SQ_valid      | SP0→SQ    | 1    | Data valid                                                                                                      |
| SP1_SQ_const_addr | SP1→SQ    | 36   | Constant address load / predicate vector load (4 bits only)/<br>Kill vector load (4 bits only) to the sequencer |
| SP1_SQ_valid      | SP1→SQ    | 1    | Data valid                                                                                                      |
| SP2_SQ_const_addr | SP2→SQ    | 36   | Constant address load / predicate vector load (4 bits only)/<br>Kill vector load (4 bits only) to the sequencer |
| SP2_SQ_valid      | SP2→SQ    | 1    | Data valid                                                                                                      |
| SP3_SQ_const_addr | SP3→SQ    | 36   | Constant address load / predicate vector load (4 bits only)/<br>Kill vector load (4 bits only) to the sequencer |
| SP3_SQ_valid      | SP3→SQ    | 1    | Data valid                                                                                                      |
| SP0_SQ_data_type  | SP0→SQ    | 2    | Data Type                                                                                                       |
|                   |           |      | 0: Constant Load                                                                                                |
|                   |           |      | 1: Predicate Set                                                                                                |
|                   |           |      | 2: Kill vector load                                                                                             |

Because the bus is shared, none of the MOVA, PREDSET or KILL instructions may be co-issued.
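The shared-bus behaviour can be illustrated with a short sketch. It assumes, without confirmation from the table, that the 4-bit predicate and kill vectors travel in the low bits of the 36-bit bus, and it reads the co-issue rule as "at most one bus-using instruction per co-issued pair". The names `SpSqDataType`, `interpret_bus` and `may_coissue` are hypothetical, and `ADD` stands in for any instruction that does not use the bus.

```python
from enum import IntEnum

class SpSqDataType(IntEnum):
    CONSTANT_LOAD = 0
    PREDICATE_SET = 1
    KILL_VECTOR_LOAD = 2

def interpret_bus(data_type: int, const_addr: int):
    """Return (kind, payload) for one valid transfer on the SPx to SQ bus."""
    kind = SpSqDataType(data_type)  # raises ValueError for the undefined encoding 3
    if kind is SpSqDataType.CONSTANT_LOAD:
        return kind, const_addr & ((1 << 36) - 1)
    # Assumption: the 4-bit predicate/kill vector sits in bits 3:0.
    return kind, const_addr & 0xF

# Instruction families that drive the shared bus, per the note above.
BUS_USERS = {"MOVA", "PREDSET", "KILL"}

def may_coissue(vector_op: str, scalar_op: str) -> bool:
    """Reject pairs in which both instructions need the shared bus."""
    return not (vector_op in BUS_USERS and scalar_op in BUS_USERS)

kind, payload = interpret_bus(1, 0xABC)
```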

# 23.2.15 SQ to SPx: constant broadcast

| Name         | Direction | Bits | Description        |
|--------------|-----------|------|--------------------|
| SQ_SPx_const | SQ→SPx    | 128  | Constant broadcast |

# 23.2.16 SQ to CP: RBBM bus

| Name           | Direction | Bits | Description          |
|----------------|-----------|------|----------------------|
| SQ_RBB_rs      | SQ→CP     | 1    | Read Strobe          |
| SQ_RBB_rd      | SQ→CP     | 32   | Read Data            |
| SQ_RBBM_nrtrtr | SQ→CP     | 1    | Optional             |
| SQ_RBBM_rtr    | SQ→CP     | 1    | Real-Time (Optional) |

### 23.2.17 CP to SQ: RBBM bus

| Name               | Direction | Bits | Description                         |
|--------------------|-----------|------|-------------------------------------|
| rbbm_we            | CP→SQ     | 1    | Write Enable                        |
| rbbm_a             | CP→SQ     | 15   | Address; upper extent is TBD (16:2) |
| rbbm_wd            | CP→SQ     | 32   | Data                                |
| rbbm_be            | CP→SQ     | 4    | Byte Enables                        |
| rbbm_re            | CP→SQ     | 1    | Read Enable                         |
| rbb_rs0            | CP→SQ     | 1    | Read Return Strobe 0                |
| rbb_rs1            | CP→SQ     | 1    | Read Return Strobe 1                |
| rbb_rd0            | CP→SQ     | 32   | Read Data 0                         |
| rbb_rd1            | CP→SQ     | 32   | Read Data 1                         |
| RBBM_SQ_soft_reset | CP→SQ     | 1    | Soft Reset                          |

### 23.2.18 SQ to CP: State report

| Name             | Direction | Bits | Description            |
|------------------|-----------|------|------------------------|
| SQ_CP_vs_event   | SQ→CP     | 1    | Vertex Shader Event    |
| SQ_CP_vs_eventid | SQ→CP     | 45   | Vertex Shader Event ID |
| SQ_CP_ps_event   | SQ→CP     | 1    | Pixel Shader Event     |
| SQ_CP_ps_eventid | SQ→CP     | 45   | Pixel Shader Event ID  |

# 23.3 Example of control flow program execution

We now provide some examples of execution to better illustrate the new design.

#### Given the program:

Alu 0
Alu 1
Tex 0
Tex 1
Alu 3 Serial
Alu 4
Tex 2
Alu 5
Alu 6 Serial
Tex 3
Alu 7
Alloc Position 1 buffer
Alu 8 Export
Tex 4
Alloc Parameter 3 buffers
Alu 9 Export 0
Tex 5
Alu 10 Serial Export 2
Alu 11 Export 1 End

#### Would be converted into the following CF instructions:

Execute 0 Alu 0 Alu 0 Tex 0 Tex 1 Alu 0 Alu 0 Tex 0 Alu 1 Alu 0 Tex
Execute 0 Alu
Alloc Position 1
Execute 0 Alu 0 Tex
Alloc Param 3
Execute\_end 0 Alu 0 Tex 1 Alu 0 Alu
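One way to read the CF notation above is as a sequence of (serial bit, engine) pairs following each Execute keyword. The sketch below parses the first and last Execute instructions of the example under that reading. `parse_cf` is a hypothetical helper, and the pair interpretation is inferred from where Serial appears in the source program (Alu 3, Alu 6 and Alu 10) rather than stated by the text.

```python
def parse_cf(line: str):
    """Split a CF instruction into its keyword and operands.

    For Execute and Execute_end the operands are returned as
    (engine, serial) pairs, e.g. ("Alu", False).
    """
    tokens = line.split()
    op, rest = tokens[0], tokens[1:]
    if op in ("Execute", "Execute_end"):
        pairs = [(rest[i + 1], rest[i] == "1") for i in range(0, len(rest), 2)]
        return op, pairs
    return op, rest

first = "Execute 0 Alu 0 Alu 0 Tex 0 Tex 1 Alu 0 Alu 0 Tex 0 Alu 1 Alu 0 Tex"
last = "Execute_end 0 Alu 0 Tex 1 Alu 0 Alu"

op, pairs = parse_cf(first)
# Positions within the clause whose serial bit is set.
serial_positions = [i for i, (_, serial) in enumerate(pairs) if serial]

end_op, end_pairs = parse_cf(last)
alloc_op, alloc_args = parse_cf("Alloc Position 1")
```

Under this reading the first clause holds ten instructions with serial bits at positions 4 and 8, which line up with Alu 3 and Alu 6 in the source program.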

And the execution of this program would look like this:

#### Put thread in Vertex RS:

Control Flow Instruction Pointer (12 bits), (CFP)
Execution Count Marker (3 or 4 bits), (ECM)
Loop Iterators (4x9 bits), (LI)
Call return pointers (4x12 bits), (CRP)
Predicate Bits (4x64 bits), (PB)
Export ID (1 bit), (EXID)
GPR Base Ptr (8 bits), (GPR)
Export Base Ptr (7 bits), (EB)
Context Ptr (3 bits), (CPTR)
LOD correction bits (16x6 bits), (LOD)

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 0   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |
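To get a feel for the size of the per-thread state, the field widths listed above can be summed and packed into a single integer. This is only a sketch: the packing order (CFP in the least significant bits) and the choice of 4 bits for ECM are assumptions, and `STATE_FIELDS`, `pack_state` and `unpack_state` are hypothetical names.

```python
# (name, width in bits); ECM is taken as 4 bits here, the text allows 3 or 4.
STATE_FIELDS = [
    ("CFP", 12), ("ECM", 4), ("LI", 4 * 9), ("CRP", 4 * 12), ("PB", 4 * 64),
    ("EXID", 1), ("GPR", 8), ("EB", 7), ("CPTR", 3), ("LOD", 16 * 6),
]

TOTAL_STATE_BITS = sum(width for _, width in STATE_FIELDS)

def pack_state(values: dict) -> int:
    """Pack field values into one integer, first field in the low bits."""
    word, shift = 0, 0
    for name, width in STATE_FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word |= value << shift
        shift += width
    return word

def unpack_state(word: int) -> dict:
    """Inverse of pack_state."""
    values, shift = {}, 0
    for name, width in STATE_FIELDS:
        values[name] = (word >> shift) & ((1 << width) - 1)
        shift += width
    return values

# State returned to the RS after the first two ALU instructions (ECM = 2).
packed = pack_state({"ECM": 2})
restored = unpack_state(packed)
```

With a 4-bit ECM the state comes to 471 bits per thread, most of it predicate bits and LOD correction bits.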

Valid Thread (VALID)
Texture/ALU engine needed (TYPE)
Texture Reads are outstanding (PENDING)
Waiting on Texture Read to Complete (SERIAL)
Allocation Wait (2 bits), (ALLOC)
00 - No allocation needed
01 - Position export allocation needed (ordered export)
10 - Parameter or pixel export needed (ordered export)
11 - Pass thru (out of order export)
Allocation Size (4 bits), (SIZE)
Position Allocated (POS\_ALLOC)
First thread of a new context (FIRST)
Last (1 bit), (LAST)

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 0       | 0      | 0     | 0    | 0         | 1     | 0    |

Then the thread is picked up for the execution of the first control flow instruction:

Execute 0 Alu 0 Alu 0 Tex 0 Tex 1 Alu 0 Alu 0 Tex 0 Alu 1 Alu 0 Tex

It executes the first two ALU instructions and goes back to the RS for a resource request change. Here is the state returned to the RS:

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 2   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |  |
|-------------|------|---------|--------|-------|------|-----------|-------|------|--|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |  |
| 1           | TEX  | 0       | 0      | 0     | 0    | 0         | 1     | 0    |  |

Then when the texture pipe frees up, the arbiter picks up the thread to issue the texture reads. The thread comes back in this state:

| State Bits |     |    |     |    |      |     |    |      |     |  |
|------------|-----|----|-----|----|------|-----|----|------|-----|--|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |  |
| 0          | 4   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |  |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 1      | 0     | 0    | 0         | 1     | 0    |




Because of the serial bit, the arbiter must wait for the texture data to return and clear the PENDING bit before it can pick the thread up. Let's say that the texture reads are complete; the arbiter then picks up the thread and returns it in this state:

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 6   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |    |         |        |       |      |           |       |      |  |
|-----------|------|---------|--------|-------|------|-----------|-------|------|--|
| VALID     | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |  |
| 1         | TEX  | 0       | 0      | 0     | 0    | 0         | 1     | 0    |  |
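The pick-up rule used in this walk-through can be written as a small predicate. This is a sketch of the behaviour described here, not the arbiter's actual logic: it ignores the ALLOC, SIZE and POS_ALLOC bits and any priority between threads, and the names `Status` and `can_pick` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Status:
    valid: bool
    type: str       # engine needed next: "ALU" or "TEX"
    pending: bool   # texture reads are outstanding
    serial: bool    # next instruction must wait for texture reads to complete

def can_pick(status: Status, free_engines: set) -> bool:
    """True if the arbiter may pick this thread up for its next engine."""
    if not status.valid:
        return False
    if status.serial and status.pending:
        return False  # must wait until the TP clears PENDING
    return status.type in free_engines

# State after the first two texture reads were issued (ECM = 4).
blocked = Status(valid=True, type="ALU", pending=True, serial=True)
# The same thread once the texture data has returned.
ready = Status(valid=True, type="ALU", pending=False, serial=True)
```

A thread with PENDING set but SERIAL clear still qualifies, which is what allows ALU work to overlap outstanding texture reads.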

Again the TP frees up, the arbiter picks up the thread and executes. It returns in this state:

| State Bits |     |    |     |    |      |     |    |      |     |  |  |  |
|------------|-----|----|-----|----|------|-----|----|------|-----|--|--|--|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |  |  |  |
| 0          | 7   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |  |  |  |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 0      | 0     | 0    | 0         | 1     | 0    |

Now, even though the texture data has not returned, the thread can still be picked up for ALU execution because the SERIAL bit is not set. The thread will, however, come back to the RS before the second ALU instruction, because that instruction has its serial bit set.

| State Bits |     |    |     |    |      |     |    |      |     |  |
|------------|-----|----|-----|----|------|-----|----|------|-----|--|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |  |
| 0          | 8   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |  |

| Status Bits |    |         |        |       |      |           |       |      |
|-----------|------|---------|--------|-------|------|-----------|-------|------|
| VALID     | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1         | ALU  | 1       | 1      | 0     | 0    | 0         | 1     | 0    |

As soon as the TP clears the pending bit the thread is picked up and returns:

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 0          | 9   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |    |         |        |       |      |           |       |      |  |
|-----------|------|---------|--------|-------|------|-----------|-------|------|--|
| VALID     | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |  |
| 1         | TEX  | 0       | 0      | 0     | 0    | 0         | 1     | 0    |  |
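The ECM values seen so far (2, 4, 6, 7, 8, 9) can be reproduced with a short simulation of the first Execute clause. The stopping rule is inferred from this walk-through and is an assumption: a thread keeps running instructions of one engine type and returns to the RS when the engine type changes or when the next instruction is serial while texture reads are pending. The sketch also assumes the worst case in which texture data returns only once the thread is blocked on it. `CLAUSE`, `run_slice` and `trace` are hypothetical names.

```python
# (engine, serial bit) for each instruction of the first Execute clause.
CLAUSE = [
    ("ALU", 0), ("ALU", 0), ("TEX", 0), ("TEX", 0), ("ALU", 1),
    ("ALU", 0), ("TEX", 0), ("ALU", 0), ("ALU", 1), ("TEX", 0),
]

def run_slice(clause, ecm, pending):
    """Run one visit to an engine; return the new (ecm, pending)."""
    kind = clause[ecm][0]
    while ecm < len(clause) and clause[ecm][0] == kind:
        if clause[ecm][1] and pending:
            break  # serial instruction must wait for outstanding texture reads
        ecm += 1
        if kind == "TEX":
            pending = True  # a texture read is now outstanding
    return ecm, pending

def trace(clause):
    """Return the ECM value each time the thread goes back to the RS."""
    ecm, pending, marks = 0, False, []
    while ecm < len(clause):
        if clause[ecm][1] and pending:
            pending = False  # assumption: the TP returns data while the thread waits
        ecm, pending = run_slice(clause, ecm, pending)
        marks.append(ecm)
    return marks

ecm_marks = trace(CLAUSE)
```

The final value of 10 marks the end of the clause, after which the thread would move on to the next control flow instruction.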

Picked up by the TP and returns:

Execute 0 Alu


0

State Bits

| - |     | HUNCONS AND | CONTRACTOR DE LA CONTRA |     | MINISTER AND ANALOGO ANALOGO AND ANALOGO ANALO |      | *************************************** | NAMES OF THE PARTY |      |     |
|---|-----|-----|----|-----|----|------|-----|----|------|-----|
|   | CFP | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
|   | 1   | 0   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 0      | 0     | 0    | 0         | 1     | 0    |

The thread is picked up by the ALU and returns (let's say the TP has not returned yet):

Alloc Position 1

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 2          | 0   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 0      | 01    | 1    | 0         | 1     | 0    |

If the SX has room for the export, the SQ allocates the space and picks up the thread for execution. The thread returns to the RS in this state:

Execute 0 Alu 0 Tex

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 3          | 1   | 0  | 0   | 0  | 0    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |

Now, since the TP has not returned yet, we must wait for it to return because we cannot issue multiple texture requests. The TP returns, clears the PENDING bit and we proceed:

0

0

Alloc Param 3

TEX

| State Bits |     |    |     |    |      |     |    |      |     |
|------------|-----|----|-----|----|------|-----|----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB | CPTR | LOD |
| 4          | 0   | 0  | 0   | 0  | 1    | 0   | 0  | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 0      | 10    | 3    | 1         | 1     | 0    |

Once again the SQ makes sure the SX has enough room in the Parameter cache before it can pick up this thread.
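The allocation check described above can be sketched as follows. This is only an illustrative model, not the SQ's actual logic: the ALLOC encodings (01 for position, 10 for parameter) are read off the example tables in this walkthrough, and the names `Thread`, `SxSpace`, and `can_pick_up` are hypothetical.

```python
from dataclasses import dataclass

# ALLOC encodings inferred from the example tables in this walkthrough
# (an assumption, not a definition taken from a register description).
ALLOC_NONE = 0b00
ALLOC_POSITION = 0b01
ALLOC_PARAM = 0b10


@dataclass
class Thread:
    alloc: int = ALLOC_NONE  # ALLOC status field
    size: int = 0            # SIZE status field (amount requested)


@dataclass
class SxSpace:
    """Hypothetical view of the free space the SX reports to the SQ."""
    position_free: int
    param_free: int


def can_pick_up(thread: Thread, sx: SxSpace) -> bool:
    """Return True if the requested allocation fits, or none is requested."""
    if thread.alloc == ALLOC_POSITION:
        return thread.size <= sx.position_free
    if thread.alloc == ALLOC_PARAM:
        return thread.size <= sx.param_free
    return True
```

For the `Alloc Param 3` step, `can_pick_up(Thread(ALLOC_PARAM, 3), SxSpace(0, 2))` is false, so under this model the thread would stay in the RS until more parameter space is free.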

Execute\_end 0 Alu 0 Tex 1 Alu 0 Alu

| State Bits |     |    |     |    |      |     |     |      |     |
|------------|-----|----|-----|----|------|-----|-----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB  | CPTR | LOD |
| 5          | 1   | 0  | 0   | 0  | 1    | 0   | 100 | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | TEX  | 1       | 0      | 0     | 0    | 1         | 1     | 0    |

This executes on the TP and then returns:

| State Bits |     |    |     |    |      |     |     |      |     |
|------------|-----|----|-----|----|------|-----|-----|------|-----|
| CFP        | ECM | LI | CRP | PB | EXID | GPR | EB  | CPTR | LOD |
| 5          | 2   | 0  | 0   | 0  | 1    | 0   | 100 | 0    | 0   |

| Status Bits |      |         |        |       |      |           |       |      |
|-------------|------|---------|--------|-------|------|-----------|-------|------|
| VALID       | TYPE | PENDING | SERIAL | ALLOC | SIZE | POS_ALLOC | FIRST | LAST |
| 1           | ALU  | 1       | 1      | 0     | 0    | 1         | 1     | 1    |

The thread waits for the TP to return because texture reads are pending (and SERIAL is set in this case). It then executes and does not return to the RS because the LAST bit is set. This is the end of the thread; before dropping it, the SQ notifies the SX of export completion.
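The pick-up rules used in this walkthrough can be summarized in a short sketch. It is a reading of the example, not the arbiter's actual implementation: it assumes a pending texture request blocks a TEX clause (no multiple texture requests) and blocks an ALU clause only when SERIAL is set. The names `Status`, `is_ready`, and `after_execute` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Status:
    valid: bool
    type: str      # "ALU" or "TEX"
    pending: bool  # a texture request is still outstanding at the TP
    serial: bool   # SERIAL bit: must wait for pending texture reads
    last: bool     # LAST bit: final clause of the thread


def is_ready(s: Status) -> bool:
    """Assumed eligibility rule inferred from the walkthrough."""
    if not s.valid:
        return False
    # A pending texture request blocks TEX clauses and serialized ALU clauses.
    if s.pending and (s.type == "TEX" or s.serial):
        return False
    return True


def after_execute(s: Status) -> str:
    """What happens once the clause has executed."""
    return "notify_sx_and_retire" if s.last else "return_to_rs"


# Final step of the example: ALU clause with PENDING, SERIAL and LAST set.
final = Status(valid=True, type="ALU", pending=True, serial=True, last=True)
blocked = is_ready(final)      # False: must wait for the TP
final.pending = False          # the TP returns and clears PENDING
outcome = after_execute(final) if is_ready(final) else "wait"
```

Under these assumptions the first ALU clause of the example (PENDING set, SERIAL clear) is eligible immediately, while the final clause is eligible only after the TP returns.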

# 24. Open issues

The size of the register file and the register file allocation method (dynamic vs. static) still need testing.

Saving power?