# Accelerated Graphics Port Interface Specification

Revision 1.0

Intel Corporation

July 31, 1996

Intel may have patents and/or patent applications related to the various Accelerated Graphics Port (AGP or A.G.P). interfaces described in the *Accelerated Graphics Port Interface Specification*. A reciprocal, royalty-free license to the electrical interfaces and bus protocols described in, and required by, the *Accelerated Graphics Port Interface Specification Revision* 1.0 is available from Intel.

> Samsung Exhibit 1024 Petitioners HTC & LG - Exhibit 1024, p. 1

Accelerated Graphics Port Interface Specification

Copyright © Intel Corporation 1996

All rights reserved.

THIS SPECIFICATION IS PROVIDED "AS IS" WITH NO WARRANTIES WHATSOEVER INCLUDING ANY WARRANTY OF MERCHANTABILITY, FITNESS FOR ANY PARTICULAR PURPOSE OR ANY WARRANTY OTHERWISE ARISING OUT OF ANY PROPOSAL, SPECIFICATION, OR SAMPLE. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED OR INTENDED HEREBY.

INTEL DISCLAIMS ALL LIABILITY, INCLUDING LIABILITY FOR INFRINGEMENT OF ANY PROPRIETARY RIGHTS, RELATING TO USE OF INFORMATION IN THIS SPECIFICATION. INTEL DOES NOT WARRANT OR REPRESENT THAT SUCH USE WILL NOT INFRINGE SUCH RIGHTS.

\*THIRD-PARTY BRANDS AND NAMES ARE THE PROPERTY OF THEIR RESPECTIVE OWNERS.

July 31, 1996

## Table of Contents

| 1. INTRODUCTION                                | 1                |
|------------------------------------------------|------------------|
| 1.1 MOTIVATION.<br>1.2 RELATIONSHIP TO PCI     | 1<br>1           |
| 2. ARCHITECTURAL CONTEXT AND SCOPE             | 3                |
| 2.1 TWO USAGE MODELS: "EXECUTE" & "DMA"        | 3<br>4<br>6<br>6 |
| 3. SIGNALS AND PROTOCOL SPECIFICATION          | 9                |
| 3.1 A.G.P. OPERATION OVERVIEW                  | 9                |
| 3.1.1 Pipeline Operation                       | 9                |
| 3.1.2 Addressing Modes and Bus Operations      | 1                |
| 3.1.3 Address Demultiplexing Option            | 3                |
| 3.2 ACCESS ORDERING RULES & FLOW CONTROL       | 6                |
| 3.2.1 Ordering Rules and Implications          | 6                |
| 3.2.2 Deadlock Avoidance                       | 9                |
| 3.2.3 Flush and Fence Commands                 | 9                |
| 3.2.4 Access Request Priority                  | 0                |
| 3.2.5 Flow Control                             | 1                |
| 3.2.5.1 Introduction                           | 1                |
| 3.2.5.2 Read Flow Control                      | 2                |
| 3.2.5.3 Write Data Flow Control                | 5                |
| 3.2.5.4 1 and 2 Clock Rule for IRDY# and TRDY# | 7                |
| 3.2.5.5 Other Flow Control Rules               | 7                |
| 3.2.6 Source Throttling Address Flow Control   | 7                |
| 3.3 PIN DESCRIPTION                            | 8                |
| 3.5 BUS TRANSACTIONS                           | 3                |
| 3.5.1 Address Transactions                     | 3                |
| 3.5.1.1 AD Bus                                 | 3                |
| 3.5.1.2 SBA Port                               | 5                |
| 3.5.2 Basic Data Transactions                  | 6                |

iii

| 3.5.2.1 1x Data Transfers                                                                                                                                            |    |
|----------------------------------------------------------------------------------------------------------------------------------------------------------------------|----|
| 3.5.2.2 2x Data Transfers                                                                                                                                            |    |
| 3.5.3 Flow Control                                                                                                                                                   |    |
| 3.5.3.1 Initial Data Block                                                                                                                                           |    |
| 3.5.3.2 Subsequent Data Block                                                                                                                                        |    |
| 3.6 ARBITRATION SIGNALING RULES                                                                                                                                      |    |
| 3.6.1 Introduction                                                                                                                                                   | 51 |
| 3.6.2 A.G.P. Compliant Master's <b>REQ</b> #                                                                                                                         |    |
| 3.6.3 GNT# and ST[2::0]                                                                                                                                              |    |
| 3.6.4 <b>GNT#</b> for Single Transactions                                                                                                                            |    |
| 3.6.5 GNT# Pipelining                                                                                                                                                | 54 |
| 3.6.6 GNT# Interaction with RBF#                                                                                                                                     | 61 |
| 3.7 A.G.P. SEQUENCER STATE MACHINE EQUATIONS                                                                                                                         | 61 |
| 3.7.1 Error Reporting                                                                                                                                                |    |
| 4. ELECTRICAL SPECIFICATION                                                                                                                                          |    |
| 4.1 OVERVIEW                                                                                                                                                         |    |
| 4.1.1 Introduction                                                                                                                                                   |    |
| 4.1.2 1X Transfer Mode Operation                                                                                                                                     |    |
| 4.1.3 2X Transfer Mode Operation                                                                                                                                     |    |
| 4.1.3.1 Transmit/Receive Outer Loop                                                                                                                                  | 65 |
| 4.1.3.2 Transmit to Receive Inner loop                                                                                                                               | 65 |
| 4.1.3.3 Transmit Outer to Inner Loop                                                                                                                                 |    |
|                                                                                                                                                                      |    |
| 4.1.3.4 Receive Inner to Outer Loop                                                                                                                                  |    |
| <ul><li>4.1.3.4 Receive Inner to Outer Loop</li><li>4.1.3.5 SB_STB Synchronization</li></ul>                                                                         | 68 |
| <ul><li>4.1.3.4 Receive Inner to Outer Loop</li><li>4.1.3.5 SB_STB Synchronization</li><li>4.2 COMPONENT SPECIFICATION</li></ul>                                     |    |
| <ul> <li>4.1.3.4 Receive Inner to Outer Loop</li> <li>4.1.3.5 SB_STB Synchronization</li> <li>4.2 COMPONENT SPECIFICATION</li></ul>                                  |    |
| <ul> <li>4.1.3.4 Receive Inner to Outer Loop</li> <li>4.1.3.5 SB_STB Synchronization</li> <li>4.2 COMPONENT SPECIFICATION</li> <li>4.2.1 DC SPECIFICATIONS</li></ul> |    |
| <ul> <li>4.1.3.4 Receive Inner to Outer Loop</li></ul>                                                                                                               |    |
| <ul> <li>4.1.3.4 Receive Inner to Outer Loop</li></ul>                                                                                                               |    |
| <ul> <li>4.1.3.4 Receive Inner to Outer Loop</li></ul>                                                                                                               |    |

| 4.2.2.3 Measurement and Test Conditions               |    |
|-------------------------------------------------------|----|
| 4.2.3 Signal Integrity Requirement                    |    |
| 4.2.4 Driver Characteristics                          |    |
| 4.2.5 Receiver Characteristics                        | 79 |
| 4.2.6 Maximum AC Ratings and Device Protection        |    |
| 4.2.7 Component Pinout Recommendations                |    |
| 4.3 MOTHERBOARD SPECIFICATION                         |    |
| 4.3.1 System Timing Budget                            |    |
| 4.3.2 Clock Skew                                      |    |
| 4.3.3 Reset                                           |    |
| 4.3.4 Interconnect Delay                              |    |
| 4.3.5 Physical Requirements                           |    |
| 4.3.5.1 Interface Signaling                           |    |
| 4.3.5.2 Pullups                                       |    |
| 4.3.5.3 Signal Routing and Layout                     |    |
| 4.3.5.4 Impedances                                    |    |
| 4.3.5.5 Vref Generation                               |    |
| 4.3.5.6 Crosstalk Consideration                       |    |
| 4.3.5.7 Line Termination                              |    |
| 4.4 ADD-IN CARD SPECIFICATION                         |    |
| 4.4.1 Clock Skew                                      |    |
| 4.4.2 Interconnect Delay                              |    |
| 4.4.3 Physical Requirements                           |    |
| 4.4.3.1 Pin Assignment                                |    |
| 4.4.3.2 Signal Routing and Layout                     |    |
| 4.4.3.3 Impedances                                    |    |
| 4.4.3.4 Vref Generation                               |    |
| 4.4.3.5 Power supply delivery                         |    |
| 5. MECHANICAL SPECIFICATION                           |    |
| 5.1 INTRODUCTION                                      |    |
| 5.2 EXPANSION CARD PHYSICAL DIMENSIONS AND TOLERANCES |    |

v

| 5.3 CONNECTOR:                                          |    |
|---------------------------------------------------------|----|
| 5.5 PLANAR IMPLEMENTATION                               | 93 |
| 5.5.1 ATX Planar Implementation                         |    |
| 5.5.2 Low Profile Planar Implementation                 | 94 |
| 5.5.3 Pin List                                          | 95 |
| 6. SYSTEM CONFIGURATION AND A.G.P. INITIALIZATION       | 97 |
| 6.1 POST-TIME INITIALIZATION                            | 97 |
| 6.1.1 A.G.P. Compliant Master Devices                   |    |
| 6.1.2 A.G.P. Compliant Target Devices                   |    |
| 6.1.3 Boot-time VGA Display Device(s)                   |    |
| 6.1.4 Operating System Initialization                   |    |
| 6.1.5 PCI Status Register                               |    |
| 6.1.6 Capabilities Pointer - (offset 34h)               |    |
| 6.1.7 Capability Identifier Register (Offset = CAP_PTR) |    |
| 6.1.8 A.G.P. status register (offset CAP_PTR + 4)       |    |
| 6.1.9 A.G.P. command register - (offset CAP_PTR + 8)    |    |
| 7. APPENDIX A                                           |    |
| 8. APPENDIX B                                           |    |

## List of Figures

| EICLIDE 1 1 SYSTEM DLOCK DIAGDAM: A G D AND DOLDELATIONSHID            | r      |
|------------------------------------------------------------------------|--------|
| FIGURE 2-1 GRAPHICS ADDRESS RE-MAPPING FUNCTION                        | ∠<br>⊿ |
| FIGURE 2-2 A G P ACCESS OUFLING MODEL                                  |        |
| FIGURE 2-3 DIFFERENT CORE LOGIC ARCHITECTURES                          |        |
| FIGURE 3-1 BASIC A G.P. PIPELINE CONCEPT                               | .10    |
| FIGURE 3-2 A.G.P./PCI OPERATIONAL STATES.                              | 10     |
| FIGURE 3-3 LAYOUT OF AN A.G.P. ACCESS REOUEST                          | .12    |
| FIGURE 3-4 MAXIMUM NUMBER OF GNT#S OUEUED BEFORE THE ASSERTION OF RBF# | 24     |
| FIGURE 3-5 MAXIMUM DELAY BY MASTER ON WRITE DATA.                      | .26    |
| FIGURE 3-6 WRITE DATA WITH ONE TP.                                     | .27    |
| FIGURE 3-7 SINGLE ADDRESS - NO DELAY BY MASTER                         | 34     |
| FIGURE 3-8 MULTIPLE ADDRESSES ENQUEUED, MAXIMUM DELAY BY MASTER        | 35     |
| FIGURE 3-9 1X SIDEBAND ADDRESSING                                      | 36     |
| FIGURE 3-10 2X SIDE BAND ADDRESSING                                    | 36     |
| FIGURE 3-11 MINIMUM DELAY BY TARGET OF READ TRANSACTION                | 37     |
| FIGURE 3-12 MINIMUM DELAY ON BACK TO BACK READ DATA                    | 38     |
| FIGURE 3-13 MASTER DOES NOT DELAY PROVIDING WRITE DATA                 | .38    |
| FIGURE 3-14 BACK TO BACK WRITE DATA TRANSFERS - NO DELAYS              | .39    |
| FIGURE 3-15 2X READ DATA NO DELAY                                      | 40     |
| FIGURE 3-16 2X BACK TO BACK READ DATA - NO DELAY                       | 40     |
| FIGURE 3-17 2X BASIC WRITE NO DELAY                                    | 41     |
| FIGURE 3-18 QUADWORD WRITES BACK TO BACK - NO DELAYS                   | 41     |
| FIGURE 3-19 MASTER'S DATA BUFFER IS FULL - NO DELAY ON READ DATA       | .42    |
| FIGURE 3-20 HIGH PRORITY READ DATA RETURNED WHILE RBF# ASSERTED        | 42     |
| FIGURE 3-21 2X READ DATA WITH RBF# ASSERTED                            | .43    |
| FIGURE 3-22 THROTTLE POINT FOR SUBSEQUENT DATA BLOCK - NO DELAY        | .44    |
| FIGURE 3-23 MASTER DELAYS SUBSEQUENT DATA BLOCK                        | .44    |
| FIGURE 3-24 WRITE WITH SUBSEQUENT BLOCK - NO DELAY                     | 45     |
| FIGURE 3-25 TARGET DELAYS SUBSEQUENT WRITE DATA BLOCK                  | 46     |
| FIGURE 3-26 EARLIEST READ DATA RETURNED AFTER A REQUEST                | 46     |
| FIGURE 3-27 REQUEST FOLLOWED BY WRITE AND THEN A READ.                 | 47     |
| FIGURE 3-28 REQUEST FOLLOWED BY READ DATA NO DELAY BY TARGET.          | 48     |
| FIGURE 3-29 REQUEST FOLLOWED BY WRITE - NO AD TURN AROUND              | 49     |
| FIGURE 3-30 BASIC PCI TRANSACTION ON A.G.P.                            | . 50   |
| FIGURE 3-31 PCI TRANSACTION BETWEEN A.G.P. REQUEST AND DATA            | 51     |
| FIGURE 3-32 WRITE DATA FOLLOWED BY READ                                | 54     |
| FIGURE 3-33 GNT# ASSERTION FOR 16, 8, AND THEN 16 BYTE READ TRANSFERS  | 57     |
| FIGURE 3-34 GNT# ASSERTION FOR NEXT READ DATA AFTER LONG DATA TRANSFER | 57     |
| FIGURE 3-35 GNT# ASSERTION FOR BACK TO BACK WRITE DATA TRANSFERS       | 58     |
| FIGURE 3-36 BACK TO BACK GNT# WITH DELAY ON INITIAL TRANSFER           | 59     |
| FIGURE 3-37 PIPELINED GNT#S - READ AND WRITES (PART 1)                 | 60     |
| FIGURE 3-38 PIPELINED GNT#S - READ AND WRITES (PART 2)                 | 60     |
| FIGURE 3-39 LP GNT# PIPELINING STOPPED WHILE RBF# IS ASSERTED          | 61     |
| FIGURE 4-1 2X MODE TIME DOMAINS                                        | 64     |
| FIGURE 4-2 TRANSMIT STROBE/DATA TIMINGS                                | 65     |
| FIGURE 4-3 RECEIVE STROBE/DATA TIMINGS                                 | 65     |
| FIGURE 4-4 TRANSMIT STROBE/CLOCK TIMINGS                               | 66     |
| FIGURE 4-5 COMPOSITE TRANSMIT TIMINGS                                  | 66     |
| FIGURE 4-6 RECEIVER INNER TO OUTER LOOP TRANSFER TIMING                | 67     |
| FIGURE 4-7 COMPOSITE RECEIVE TIMINGS                                   | 68     |
| FIGURE 4-8 SB_STB SYNCHRONIZATION PROTOCOL                             | 69     |
| FIGURE 4-9 A.G.P. 1X TIMING DIAGRAM                                    | 73     |
| FIGURE 4-10 A.G.P. 133 TIMING DIAGRAM                                  | 75     |
| FIGURE 4-11 STROBE/DATA TURNAROUND TIMINGS                             | 75     |
| FIGURE 4-12 OUTPUT TIMING MEASUREMENT CONDITIONS                       | 76     |
| FIGURE 4-13 INPUT TIMING MEASUREMENT CONDITIONS                        | /6     |

| FIGURE 4-14 T <sub>VAL</sub> (MAX) RISING EDGE                                            | .77  |
|-------------------------------------------------------------------------------------------|------|
| FIGURE 4-15 T <sub>VAL</sub> (MAX) FALLING EDGE                                           | .77  |
| FIGURE 4-16 T <sub>VAI</sub> (MIN) AND SLEW RATE                                          | . 78 |
| FIGURE 4-17 V/L CURVES FOR 3 3V SIGNALING                                                 | 79   |
| FIGURE 4-18 MAXIMUM AC WAVEFORMS FOR 3.3V SIGNALING                                       | . 80 |
| FIGURE 4-19 RECOMMENDED COMPONENT PINOUT                                                  | . 81 |
| FIGURE 4-20 CLOCK SKEW DIAGRAM                                                            | . 84 |
| FIGURE 5-1 A.G.P. CARD EDGE CONNECTOR                                                     | . 90 |
| FIGURE 5-2 DETAIL A, A.G.P. CARD EDGE FINGER LAYOUT                                       | . 90 |
| FIGURE 5-3 A.G.P. CARD EDGE CONNECTOR BEVEL                                               | . 91 |
| FIGURE 5-4 MOTHERBOARD CONNECTOR FOOTPRINT AND LAYOUT DEMINSIONS.                         | . 92 |
| FIGURE 5-5 MOTHERBOARD CONNECTOR LAYOUT RECOMMENDATION                                    | . 93 |
| FIGURE 5-6 TYPICAL ATX IMPLEMENTATION                                                     | . 94 |
| FIGURE 6-1 CONFIGURATION VIEW OF AN A.G.P. TARGET                                         | . 98 |
| FIGURE 6-2 LOCATION OF A.G.P. CAPABILITIES                                                | . 99 |
| FIGURE 8-1 SINGLE ADDRESS - NO DELAY BY MASTER.                                           | 105  |
| FIGURE 8-2 SINGLE ADDRESS - MAXIMUM DELAY BY MASTER                                       | 105  |
| FIGURE 8-3 MULTIPLE ADDRESS - NO DELAY BY MASTER                                          | 106  |
| FIGURE 8-4 MULTIPLE ADDRESS - MAXIMUM DELAY BY MASTER                                     | 106  |
| LICTIDE 8 C JA ZIDEDND V DDDESEDIC                                                        | 100  |
| FIGURE 8-0 ZA SIDEDID ADDRESSING. $\Box$                                                  | 107  |
| FIGURE 8-7 MINIMUM DELAY BY TARGET OF READ TRANSACTION                                    | 107  |
| FIGURE 8-9 MINIMUM DELAY OF RACK TO BACK READ DATA                                        | 108  |
| FIGURE 8-10 MAXIMUM DELAY OF BACK TO BACK READ DATA                                       | 100  |
| FIGURE 8-11 MINIMUM DELAY BY MASTER OF WRITE DATA                                         | 109  |
| FIGURE 8-12 MAXIMUM DELAY BY MASTER OF WRITE DATA                                         | 110  |
| FIGURE 8-13 MINIMUM DELAY BY MASTER OF BACK TO BACK WRITE DATA                            | 110  |
| FIGURE 8-14 MASTER DELAYS INITIAL WRITE NO DELAY                                          | 111  |
| FIGURE 8-15 2X READ DATA - NO DELAY                                                       | 111  |
| FIGURE 8-16 2X READ DATA - WITH DELAY                                                     | 112  |
| FIGURE 8-17 2X BACK TO BACK READ DATA - NO DELAY 1                                        | 112  |
| FIGURE 8-18 2X BACK TO BACK READ DATA - MAXIMUM DELAY 1                                   | 113  |
| FIGURE 8-19 2X WRITE DATA - NO DELAY                                                      | 113  |
| FIGURE 8-20 2X WRITE DATA - MAXIMUM DELAY                                                 | 114  |
| FIGURE 8-21 2X BACK TO BACK WRITES - NO DELAY                                             | 114  |
| FIGURE 8-22 2X BACK TO BACK WRITES - MAXIMUM DELAY                                        | 115  |
| FIGURE 8-23 2X WRITES, INITIAL TRANSACTION WITH DELAY, SUBSEQUENT TRANSACTIONS NO DELAY ] | 115  |
| FIGURE 8-24 MASTER DATA BUFFER FULL - MINIMUM DELAY FOR READ DATA                         | 116  |
| FIGURE 8-25 MASTER DATA BUFFER FULL - MAXIMUM DELAY FOR KEAD DATA                         | 116  |
| FIGURE 8-26 KBF# ASSERTED, HP KEAD DATA KETUKNED                                          | 110  |
| FIGURE 8-27 KBF# ASSERTED, MAXIMUM DELAY BY TARGET, HP READ DATA RETURNED                 | 117  |
| FIGURE 8-20 2X READ DATA, RDF# ASSERTED, MAXIMUM DELAT DT TARGET                          | 117  |
| FIGURE 8-29 2X READ, RDF# ASSERTED, NO III READ DATA                                      | 118  |
| FIGURE 8-31 2X READ DATA, REF# ASSERTED HP READ DATA DELAYED                              | 119  |
| FIGURE 8-32 READ DATA NO DELAY SUBSEQUENT BLOCK                                           | 119  |
| FIGURE 8-33 READ DATA, MASTER READY - TARGET DELAYS SUBSEQUENT BLOCK                      | 120  |
| FIGURE 8-34 READ DATA, TARGET READY - MASTER DELAYS SUBSEQUENT BLOCK                      | 120  |
| FIGURE 8-35 READ DATA, TARGET DELAYS 1 CLOCK - MASTER DELAYS 2 CLOCKS SUBSEQUENT BLOCK 1  | 121  |
| FIGURE 8-36 WRITE DATA, NO DELAY SUBSEQUENT BLOCK                                         | 121  |
| FIGURE 8-37 WRITE DATA, TARGET DELAYS SUBSEQUENT BLOCK 2 CLOCKS                           | 122  |
| FIGURE 8-38 2X WRITE NO DELAY INITIAL OR SUBSEQUENT BLOCK                                 | 122  |
| FIGURE 8-39 2X WRITE, INITIAL DELAY, NO DELAY SUBSEQUENT BLOCK 1                          | 123  |
| FIGURE 8-40 2X WRITE, NO INITIAL DELAY, 2 CLOCKS DELAY SUBSEQUENT BLOCK 1                 | 123  |
| FIGURE 8-41 EARLIEST READ DATA AFTER REQUEST                                              | 124  |
| FIGURE 8-42 MAXIMUM DELAY READ DATA AFTER REQUEST                                         | 124  |
| FIGURE 8-43 REQUEST FOLLOWED BY LONG DATA READ                                            | 125  |

|                                                                                             | 105 |
|---------------------------------------------------------------------------------------------|-----|
| FIGURE 8-44 EARLY GN 1# FOR READ DATA                                                       | 125 |
| FIGURE 8-45 EARLY GNT# FOR READ DATA, READ DATA DELAYED                                     | 126 |
| FIGURE 8-46 2X READ, EARLY GNT#, DELAYED READ DATA                                          | 126 |
| FIGURE 8-47 DATA TRANSFERS INTERVENED WITH REQUESTS                                         | 127 |
| FIGURE 8-48 INTERVENED REQUEST, SUBSEQUENT 2X READ DATA DELAYED                             | 128 |
| FIGURE 8-49 REQUEST FOLLOWED BY WRITE DATA, FOLLOWED BY READ DATA                           | 128 |
| FIGURE 8-50 MULTIPLE REQUESTS FOLLOWED BY WRITE DATA - NO TURN-AROUND CYCLE                 | 129 |
| FIGURE 8-51 REQUEST FOLLOWED BY WRITE DATA DELAYED, FOLLOWED BY READ DATA - NO DELAY        | 129 |
| FIGURE 8-52 REQUEST FOLLOWED BY READ DATA, FOLLOWED BY WRITE DATA                           | 130 |
| FIGURE 8-53 REQUEST FOLLOWED BY 2X READ DATA NO DELAY                                       | 130 |
| FIGURE 8-54 2X WRITE FOLLOWED BY 2X READ WITH SUBSEQUENT BLOCK DELAYED                      | 131 |
| FIGURE 8-55 2X WRITE FOLLOWED BY 2X READ WITH INITIAL AND SUBSEQUENT BLOCKS DELAYED         | 131 |
| FIGURE 8-56 2X WRITE, 2X READ NO INITIAL DELAY, SUBSEQUENT BLOCK DELAYED BY TARGET          | 132 |
| FIGURE 8-57 2X WRITE FOLLOWED BY 2X READ, INITIAL AND SUBSEQUENT BLOCK DELAYED BY TARGET    | 132 |
| FIGURE 8-58 2X WRITE FOLLOWED BY 2X READ, MASTER DELAYS SUBSEQUENT BLOCK                    | 133 |
| FIGURE 8-59 2X WRITE FOLLOWED BY 2X READ INITIAL AND SUBSEQUENT BLOCKS DELAYED              | 133 |
| FIGURE 8-60 2X READS BACK TO BACK DIFFERENT LENGTHS                                         | 134 |
| FIGURE 8-61 MULTIPLE REQUESTS FOLLOWED BY 2X WRITE NO DELAY INITIAL OR SUBSEQUENT BLOCKS    | 134 |
| FIGURE 8-62 REOUESTS FOLLOWED BY 2X WRITE FARLY GNT# NO DELAY                               | 135 |
| FIGURE 8-63 SINGLE REQUEST. 2X WRITE 2X READ FARLY GNT#                                     | 135 |
| FIGURE 8-64 REQUEST FOLLOWED BY 2X READ FOLLOWED 2X WRITE LATE GNT#                         | 136 |
| FIGURE 8-65 SINGLE REQUEST FOLLOWED BY 2X WRITE AND 2X READ WITH DELAYS FARLY GNT#          | 136 |
| FIGURE 8-66 SINGLE REQUEST FOLLOWED BY 2X WRITE AND 2X READ FARLY GNT#                      | 137 |
| FIGURE 8-67 SINGLE REQUEST FOLLOWED BY 2A WRITE (DELAY) FOLLOWED BY 2X READ (DELAY)         | 137 |
| FIGURE 8-68 2X WRITE FOLLOWED BY 2X READ GNT# DELAY J CLOCK                                 | 138 |
| FIGURE 8-60-2X WRITE FOLLOWED BY 2X READ, GNT# DELATED T CLOCK                              | 130 |
| FIGURE 8-09 2X WRITE FOLLOWED BY 2X READ, GNT# NO DELAT.                                    | 120 |
| FIGURE 8-70 2X WRITE FOLLOWED BY 2X READ, ON 1# DELATED MAXIMUM WITH NO DEAD CLOCK          | 120 |
| FIGURE 8-71 SINGLE REQUEST, MAXIMUM 2X WRITE DELAT, 2X READ LATE ON 1#                      | 139 |
| FIGURE 8-72 SINGLE REQUEST, 2X WRITE, 2X READ LATE GN 1#                                    | 140 |
| FIGURE 8-75 SINGLE REQUEST, ZA WRITE, ZA READ EARLIEST GIN 1#5                              | 140 |
| FIGURE 8-74 PCI WRITE TRANSACTION ON A.G.P.                                                 | 141 |
| FIGURE 8-75 PCT TRANSACTION BETWEEN 2 A.G.P. DATA TRANSFERS                                 | 142 |
| FIGURE 8-76 SINGLE REQUEST, PCI TRANSACTION FOLLOWED BY READ DATA (NO DELAYS)               | 143 |
| FIGURE 8-77 PCI TRANSACTION FOLLOWED BY A.G.P. REQUEST, READ DATA (NO DELAYS)               | 144 |
| FIGURE 8-78 BTOB 8 BYTE WRITES, INITIAL WRITE WITH MAX DELAY, SUBSEQUENT WRITES NO DELAYS   | 144 |
| FIGURE 8-79 8 BYTE WRITE, INITIAL MAX DELAY, SUBSEQUENT MAX DELAY (PART 1)                  | 145 |
| FIGURE 8-80 8 BYTE WRITE, INITIAL MAX DELAY, SUBSEQUENT MAX DELAY (PART 2)                  | 145 |
| FIGURE 8-81 BTOB 8 BYTE WRITES, INITIAL WRITE WITH DELAY, SUBSEQUENT WRITES DELAYS (PART 1) | 146 |
| FIGURE 8-82 BTOB 8 BYTE WRITES, INITIAL WRITE WITH DELAY, SUBSEQUENT WRITES DELAYS(PART 2)  | 146 |
| FIGURE 8-83 16 BYTE WRITES, INITIAL MAX DELAY, NO SUBSEQUENT DELAY                          | 147 |
| FIGURE 8-84 16 BYTE WRITES, INITIAL MAX DELAY, SUBSEQUENT MAX DELAYS                        | 147 |
| FIGURE 8-85 32 BYTE WRITES, INITIAL MAX DELAY, NO SUBSEQUENT DELAY                          | 148 |
| FIGURE 8-86 32 BYTE WRITES, INITIAL MAX DELAY, SUBSEQUENT MAX DELAYS                        | 148 |
| FIGURE 8-87 THREE 8 BYTE WRITES, 8 BYTE READ, 40 BYTE READ, THREE 8 BYTE WRITES (PART 1)    | 149 |
| FIGURE 8-88 THREE 8 BYTE WRITES, 8 BYTE READ, 40 BYTE READ, THREE 8 BYTE WRITES (PART 2)    | 149 |
| FIGURE 8-89 16 BYTE READ, 8 BYTE READ, 16 BYTE READ                                         | 150 |
| FIGURE 8-90 BACK TO BACK 32 BYTE READS                                                      | 150 |
| FIGURE 8-91 BACK TO BACK 40 BYTE READS                                                      | 151 |

ix

## List of Tables

| TABLE 3-1 A.G.P. BUS COMMANDS                        | 12 |
|------------------------------------------------------|----|
| TABLE 3-2 SIDEBAND ADDRESS PORT ENCODING             | 14 |
| TABLE 3-3 A.G.P. FLOW CONTROL POINTS                 | 21 |
| TABLE 3-4 DATA BUFFERING FOR 2X TRANSFERS            | 23 |
| TABLE 3-5 A.G.P. SIGNALS REQUIRED                    | 28 |
| TABLE 3-6 A.G.P. ADDRESSING                          | 29 |
| TABLE 3-7 A.G.P. FLOW CONTROL                        | 29 |
| TABLE 3-8 A.G.P. STATUS SIGNALS                      | 30 |
| TABLE 3-9 A.G.P. CLOCK LIST                          | 31 |
| TABLE 3-10 PCI SIGNALS IN RELATION TO A.G.P.         | 32 |
| TABLE 3-11 A.G.P. ARBITRATION RULES                  | 52 |
| TABLE 3-12 GNT# DURATION                             | 53 |
| TABLE 3-13 CURRENT/NEXT AD ACTIVITY                  | 56 |
| TABLE 4-1: DC SPECIFICATIONS FOR A.G.P. 1X SIGNALING | 70 |
| TABLE 4-2: DC SPECIFICATIONS FOR A.G.P. 2X SIGNALING | 71 |
| TABLE 4-3: A.G.P. 1X AC TIMING PARAMETERS            | 72 |
| TABLE 4-4 A.G.P. 2X AC TIMING PARAMETERS             | 74 |
| TABLE 4-5 MEASUREMENT AND TEST CONDITION PARAMETERS  | 77 |
| TABLE 4-6 SIGNAL INTEGRITY REQUIREMENTS <sup>1</sup> | 78 |
| TABLE 4-7: SYSTEM TIMING SUMMARY                     | 82 |
| TABLE 4-8: INTERCONNECT DELAY SUMMARY                | 83 |
| TABLE 4-9 CLOCK SKEW PARAMETERS                      | 84 |
| TABLE 4-10: MOTHERBOARD INTERCONNECT DELAYS          | 85 |
| TABLE 4-11: PULL-UP RESISTOR VALUES                  | 86 |
| TABLE 4-12: ADD-IN CARD INTERCONNECT DELAYS          | 87 |

## **Revision History**

| Revision | Revision History         | Date    |
|----------|--------------------------|---------|
| 1.0      | Initial Released Version | 7/31/96 |

## 1. Introduction

The Accelerated Graphics Port (AGP or A.G.P.) is a high performance, component level interconnect targeted at 3D graphical display applications and is based on a set of performance extensions or enhancements to PCI. This document specifies the A.G.P. interface, and provides some design suggestions for effectively using it in high performance 3D graphics display applications.

## 1.1 Motivation.

In general, 3D rendering has a voracious appetite for memory bandwidth, and continues to put upward pressure on memory footprint as well. As 3D hardware and software become more pervasive, these two trends are likely to accelerate, requiring high speed access to ever larger amounts of memory, thus raising the bill of material costs for 3D enabled platforms. Containing these costs while enabling performance improvements is the primary motivation for the A.G.P.. By providing up to an order of magnitude bandwidth improvement between the graphics accelerator and system memory, some of the 3D rendering data structures may be effectively shifted into main memory, relieving the pressure to increase the cost of the local graphics memory.

Texture data are the first structures targeted for shifting to system memory for four reasons:

- 1. Textures are generally read only, and therefore do not have special access ordering or coherency problems.
- 2. Shifting textures balances the bandwidth load between system memory and local graphics memory, since a well cached host processor has much lower memory bandwidth requirements than does a 3D rendering engine. Texture access comprises perhaps the largest single component of rendering memory bandwidth (compared with rendering, display and Z buffers), so avoiding loading or caching textures in graphics local memory saves not only this component of local memory bandwidth, but also the bandwidth necessary to load the texture store in the first place. Furthermore, this data must pass through main memory anyway as it is loaded from a mass store device..
- 3. Texture size is dependent upon application quality rather than on display resolution, and therefore subject to the greatest pressure for growth.
- 4. Texture data is not persistent; it resides in memory only for the duration of the application, so any system memory spent on texture storage can be returned to the free memory heap when the application concludes (unlike display buffers which remain in persistent use).

Other data structures may be moved to main memory but texture data is the biggest win.

Reducing costs by moving graphics data to main memory is the primary motivation for the A.G.P., which is designed to provide a smooth, incremental transition for today's PCI based graphics vendors as they develop higher performance components in the future.

## 1.2 Relationship to PCI

The A.G.P. interface specification uses the 66 MHz PCI (*Revision 2.1*) specification as an operational baseline, and provides three significant performance extensions or enhancements to the PCI specification which are intended to optimize the A.G.P. for high performance 3D graphics applications. These A.G.P. extensions are NOT described in, or required by, the PCI specification (*Rev. 2.1*). These extensions are:

- Deeply pipelined memory read and write operations, fully hiding memory access latency.
- Demultiplexing of address and data on the bus, allowing almost 100% bus efficiency.
- AC timing for 133 MHz data transfer rates, allowing for real data throughput in excess of 500 MB/sec..

These enhancements are realized through the use of "sideband" signals. The PCI Specification has not been modified in any way, and the A.G.P. interface specification has specifically avoided the use of any of the "reserved" fields, encodings, pins, etc. in the PCI Specification. The intent is to utilize the PCI design base, while providing a range of graphics-oriented performance enhancements with varying complexity/performance tradeoffs available to the component provider.

A.G.P. neither replaces nor diminishes the necessity of PCI in the system. This high speed port (A.G.P.) is physically, logically, and electrically independent of the PCI bus. It is an additional connection point in the system, as shown in Figure 1-1. It is intended for the exclusive use of visual display devices; all other I/O devices will remain on the PCI bus. The add-in slot defined for A.G.P. uses a new connector body (for electrical signaling reasons) which is not compatible with the PCI connector; PCI and A.G.P. boards are **not** mechanically interchangeable.

The A.G.P. interface specification was developed by Intel, independent of the PCI Special Interest Group, and has been neither reviewed nor endorsed by that group. It is intended to encourage innovation in personal computer graphics technology and products.



Figure 1-1 System Block Diagram: A.G.P. and PCI Relationship

## 2. Architectural Context and Scope

This chapter is intended to provide an architectural context for the actual specification of the Accelerated Graphics Port (A.G.P. or AGP) interface, hopefully motivating the design, and making details easier to understand. As such, this chapter does not define any interface requirements, and therefore is not necessary for implementing the A.G.P. interfaces or A.G.P. compliant parts or systems. It also is intended to set the technical scope of the specification in some areas by providing examples of issues beyond the purview of the formal interface specification.

## 2.1 Two Usage Models: "Execute" & "DMA"

There are two primary A.G.P. usage models for 3D rendering, that have to do with how data are partitioned and accessed, and the resultant interface data flow characteristics. In the "DMA" model, the primary graphics memory is the local memory associated with the accelerator, referred to as 'local frame buffer'. 3D structures are stored in system memory, but are not used (or "executed") directly from this memory; rather they are copied to primary (local) memory, to which the rendering engine's address generator makes it's references. This implies that the traffic on the A.G.P. tends to be long, sequential transfers, serving the purpose of bulk data transport from system memory to primary graphics (local) memory. This sort of access model is amenable to a linked list of physical addresses provided by software (similar to operation of a disk or network I/O device), and is generally not sensitive to a non-contiguous view of the memory space.

In the "execute" model, the accelerator uses both the local memory and the system memory as primary graphics memory. From the accelerator's perspective, the two memory systems are logically equivalent; any data structure may be allocated in either memory, with performance optimization as the only criteria for selection. In general, structures in system memory space are not copied into the local memory prior to use by the accelerator, but are "executed" in place. This implies that the traffic on the A.G.P. tends to be short, random accesses, which are not amenable to an access model based on software resolved lists of physical addresses. Since the accelerator generates direct references into system memory, a contiguous view of that space is essential. But, since system memory is dynamically allocated in random 4K pages, it is necessary in the "execute" model to provide an address mapping mechanism that maps random 4K pages into a single contiguous, physical address space.

The A.G.P. supports both the "DMA" and "execute" models. However, since a primary motivation of the A.G.P. is to reduce growth pressure on local memory, the "execute" model is the design center. Consistent with that emphasis, this interface specification requires a physical-to-physical address re-mapping mechanism which insures the graphics accelerator (A.G.P. master) will have a contiguous view of graphics data structures dynamically allocated in system memory.

July 31, 1996



Figure 2-1 Graphics Address Re-mapping Function

This address re-mapping applies only to a single, programmable range of the system physical address space, as shown in Figure 2-1. The 32-bit physical address space shown is common to all system agents. Addresses falling in this range are re-mapped to non-contiguous pages of physical system memory. All addresses not in this range are passed through without modification, and map directly to main system memory, or to device specific ranges, such as the graphics local frame buffer memory shown here.

Re-mapping is accomplished via a memory-based table called the Graphics Address Re-mapping Table (GART), which is set up and maintained by the mapping API described in chapter 6, and used ("walked") by the core logic to perform the re-mapping. In order to avoid compatibility issues and allow future implementation flexibility, this mechanism is specified at a software (API) level. In other words, the actual GART table format is not specified; rather it is abstracted to the API by a HAL or mini-port driver that must be provided with the core logic. While this API does not constrain the future partitioning of re-mapping hardware, the re-mapping function will initially be implemented in the chipset or core logic. *Note: this re-mapping function should not be confused in any way with the system address translation table mechanism. While some of the concepts are similar, these are completely separate mechanisms which operate independently, under control of the operating system.* 

## 2.2 A.G.P. Queuing Models

Both A.G.P. bus transactions and PCI bus transactions may be run over the A.G.P. interface. An A.G.P. compliant device may transfer data to system memory using either A.G.P. transactions or PCI transactions. The core logic can access the A.G.P. compliant master (graphics) device only with PCI transactions. Traffic on the A.G.P. interface may consist of a mixture of interleaved A.G.P. and PCI transactions.

The access request and data queue structures are illustrated in Figure 2-2.



Figure 2-2 A.G.P. Access Queuing Model

A.G.P. transactions are run in a split transaction fashion where the request for data transfer is "disconnected" from the data transfer itself. The A.G.P. compliant device initiates an A.G.P. transaction with an access request. The core logic responds to the access request by directing the corresponding data transfer at a later time. The fact that the access requests are separated from the data transfers allows the A.G.P. compliant device to issue several access requests in a pipelined fashion while waiting for the data transfers to occur. Pipelining access requests results in having several read and/or write requests outstanding in the core logic's *request queue* at any point in time. The request queue is divided into high priority and low priority sub-queues, each of which deal with respective accesses according to separate priority and ordering rules. The A.G.P. compliant device tracks the state of the request queue in order to limit the number of outstanding requests and identify data transactions.

The core logic processes the access requests present in its request queue. Read data will be obtained from system memory and returned at the core chipset's initiative via the *read data return queue*. Write data will be provided by the A.G.P. device at the core logic's direction when space is available in the core logic's *write data queue*. Therefore A.G.P. transaction traffic will generally consist of interleaved access requests and data transfers.

All PCI transactions on the A.G.P. also have their own queues, separate from the A.G.P. transaction queues. Each queue has its own access and ordering rules. Not shown in figure 2-2 is the core logic queue which handles processor accesses directly to the A.G.P. compliant device, all of which are executed as nonpipelined PCI bus transactions.

July 31, 1996

## 2.3 Performance Considerations

Deep pipelining capability allows the A.G.P. to achieve a total memory READ throughput equal to that possible for memory WRITES<sup>1</sup>. This capability, coupled with optional higher transfer rates and address demultiplexing allows a full order of magnitude increase in memory read throughput over today's PCI implementations. However, many typical desktop platforms will not have sufficient total memory performance to allow full utilization of the A.G.P. capabilities, and ultimately platform issues are likely to predominate in the upper performance limit deliverable through the A.G.P. This makes it very difficult to provide as part of this interface specification a single set of performance guarantees or targets the graphics designers can depend upon. It is clear that in order to optimize a graphics design for the most effective use of the A.G.P., it will need to be targeted at a specific subset of platforms and/or core logic devices.

In an attempt to provide the best possible information for such an optimization, this interface specification defines a few common performance parameters that may be of general interest, and recommends that core logic vendors and/or OEMs provide these parameters for their systems to respective graphics IHVs.

The following are the basic parameters that each core logic set and/or system implementation should provide as a performance baseline for IHVs targeting that platform.

- **Guaranteed Latency**: a useable worst case A.G.P. memory access latency via the HIGH PRIORITY QUEUE, as measured from the clock on which the request (**REQ#**) signal is asserted until the first clock of data transfer. Assumptions: no outstanding A.G.P. requests (pipeline empty); no wait states or control flow asserted by the graphics master master is ready to transfer data on any clock (inserting *n* clocks of control flow may delay response by more than *n* clocks).
- **Typical Latency**: the typical A.G.P. memory access latency via the LOW PRIORITY QUEUE, as measured from the clock on which the request (**REQ#**) signal is asserted until the first clock of data transfer. Assumptions: no outstanding A.G.P. requests (pipeline empty); no wait states or control flow asserted by the graphics master master is ready to transfer data on any clock (inserting *n* clocks of control flow may delay response by more than *n* clocks).
- **Mean bandwidth**: deliverable A.G.P. memory bandwidth via the LOW PRIORITY QUEUE, averaged across ~10mS (one frame display time). Assumptions: no accesses to the high priority queue; graphics master maintains optimal pipeline depth of <u>x</u>; average access length of <u>y</u>; no wait states or control flow asserted by the graphics master.

## 2.4 Platform Dependencies

Due to the close coupling of the A.G.P. and main memory subsystem, there are several behaviors of the A.G.P. that will likely end up being platform dependent. While the objective of any specification is to minimize such differences, it is apparent that several are probable among A.G.P. compliant core logic and platform implementations. This should not, however, have as much impact as it would in other buses for two reasons:

<sup>&</sup>lt;sup>1</sup> Memory read throughput on PCI is about half of memory write throughput, since memory read access time is visible as wait states on this unpipelined bus.

- The A.G.P. is a point-to-point connection, intended for use by a 3D graphics accelerator only, and,
- Due to performance issues (section 2.3), A.G.P. compliant graphics devices will likely need to be optimized to a specific subset of platform or core logic implementations anyway.

The purpose of this section is to identify by example some of the areas where behavioral differences are likely, and accordingly establish the scope of this interface specification.

As one example of potential variation, consider the two core logic architectures shown in Figure 2-3. An integrated approach, typical of desktop and volume systems, is shown on the left, and a symmetric multiprocessor partitioning, typical of MP servers, is shown on the right.



Figure 2-3 Different Core Logic Architectures

The following items are examples of areas where behavioral differences between these or other implementations could well occur.

- **GART Implementation**: for various reasons, different systems may opt for different GART table implementations and layouts. This, however, is not visible since the actual table implementation is abstracted to a common API by the HAL or mini-port driver supplied with the core logic.
- **Coherency with processor cache**: due to the high potential access rate on the A.G.P., it is not advisable from a performance perspective to snoop all accesses. While an attribute in the Graphics Address Re-mapping Table could indicated pages in which snoops were needed, doing this selectively forces a split and later re-combining of the access queue in the integrated chipset, which is very costly with marginal benefit (there are reasonable software solutions to this problem). On the other hand, the MP bus in the MP system could well deal with selective snoops very easily. As a result, processor cache snooping may be chipset dependent. This interface specification does not require snooping, and in general it must not be counted on. Note that all PCI bus transactions are by definition coherent, and always snoop the processor cache.
- **Bus-to-bus traffic capability**: bus masters on either the A.G.P. or the PCI bus will routinely access system memory. However, it is possible to also address targets on the other bus or port, which effectively requires a PCI-to-PCI bridge in the integrated chipset. Pushing WRITES through this bridge is fairly simple, whereas pushing READS through requires a complete bridge implementation, and it is not clear this would ever be utilized. Therefore, this interface specification requires

support for WRITES between agents on PCI and the A.G.P.<sup>2</sup>, but does not require support for READS between the A.G.P. and PCI, and READS should, in general, not be assumed to work.

- Address re-mapping support for PCI compliant bus master accesses to memory: there could be situations where it would be convenient for memory accesses from a PCI bus compliant master to use the GART range address re-mapping services provided to the A.G.P. compliant graphics device. One example, would be a capture device writing to a tiled or non-linear graphics surface allocated in system memory<sup>3</sup>. While this works very naturally in the integrated (desktop) chipset solution, it may well require redundant mapping hardware in the MP solution. Therefore, this interface specification recommends, but does not require address re-mapping support for accesses from a PCI compliant bus master. In the case where re-mapping is not provided for PCI compliant masters, an address in the GART address range must be deemed an access error, and dealt with consistent with the error handling policies of that platform. In *NO* case may an address falling in the GART addresses must be EITHER re-mapped, OR trapped and faulted. Note: while no tangible requirement for PCI bus re-mapping support exists today, it may be dangerous to fault physical addresses, and this solution, while possible, is NOT recommended.
- **Performance**: as already discussed in section 2.3, a variety of performance parameters will likely have chipset and/or platform dependencies.

These are examples of platform dependencies that have been determined to fall outside the scope of this interface specification. In general, the scope of this interface specification is limited to the electrical and logical behavior of the actual A.G.P. interface signals, the mechanical definition of an A.G.P. compliant add-in board, the newly defined A.G.P. configuration registers, and software API which controls the graphics address re-mapping function.

 $<sup>^{2}</sup>$  By way of example, this allows for a video stream generator (e.g., capture, decode, etc.) on PCI to WRITE to the graphics frame buffer on the A.G.P.

<sup>&</sup>lt;sup>3</sup> Note that there is currently no example of this kind of I/O and buffer management. Today a capture device writes linearly to a linked list of discontiguous pages.

## 3. Signals and Protocol Specification

## 3.1 A.G.P. Operation Overview

Memory access pipelining is the major PCI protocol enhancement provided under the Accelerated Graphics Port (A.G.P. or A.G.P.) Interface Specification. A.G.P. pipelined bus transactions share most of the PCI signal set, and are actually interleaved with PCI transactions on the bus. Only memory read and write bus operations targeted at main memory can be pipelined; all other bus operations, including those targeted at device-local memories (e.g., frame buffers), are executed as PCI transactions, as defined in the PCI Rev. 2.1 Specification.

A.G.P. pipelined operation allows for a single *A.G.P. compliant target*, which must always be the system memory controller, referred to in this document as *core logic*. In addition to A.G.P. compliant target functions, the core logic must also implement a complete PCI sequencer<sup>4</sup>, both master and target. For electrical signaling reasons the A.G.P. is defined as a point-to-point connection, therefore there is also a single *A.G.P. compliant master*, which, in addition to implementing the A.G.P. compliant master functions, must also provide full PCI compliant target functionality<sup>5</sup> - PCI compliant master functionality is optional.

#### 3.1.1 Pipeline Operation

The A.G.P. interface is comprised of a few newly defined "sideband" control signals which are used in conjunction with the PCI signal set. A.G.P.-defined protocols (e.g., pipelining) are overlaid on the PCI bus at a time and in a manner that a PCI bus agent (non-A.G.P.) would view the bus as idle. Both pipelined access requests (read or write) and resultant data transfers are handled in this manner. The A.G.P. interface uses both PCI bus transactions without change, as well as A.G.P. pipelined transactions as defined herein. Both of these classes of transactions are interleaved on the same physical connection. The access request portion of an A.G.P. transaction (bus command, address and length) is signaled differently than is a PCI address phase. The information is still transferred on the **AD** and **C/BE#** signals of the bus as is the case with PCI<sup>6</sup>, but is identified or framed with a new control signal, **PIPE#**, in a similar way to which PCI address phases are identified with **FRAME#**.

The maximum depth of the A.G.P. pipeline is not architecturally constrained but is set to a maximum of 256 by this interface specification. However, the maximum A.G.P. pipeline depth may be reduced further by the capabilities of both master and target. The target provides an implementation dependent number of pipe slots, which is identified at configuration time and made known to the bus master (see section 6.3). The pipeline is then source throttled, since a master is never allowed to have more outstanding requests than the number of pipe slots it has been allocated.

<sup>&</sup>lt;sup>4</sup> The A.G.P. Target must behave as a PCI 2.1 compliant Master and Target with one exception. The A.G.P. Target is not required to adhere to the target initial latency requirements as stated in the PCI 2.1 specification.

<sup>&</sup>lt;sup>5</sup> The A.G.P. Master must function as a 2.1 compliant PCI target.

<sup>&</sup>lt;sup>6</sup> Note that there are some optional mechanisms that allow address demultiplexing - using an alternate (non-**AD** bus) mechanism for transferring address - which is described in section 3.1.3.

July 31, 1996

The notion of intervening in a pipelined transfer enables the bus master to maintain the pipe depth by inserting new requests between data replies. This notion is similar to the software notion of a subroutine call; one context (data reply) running on the physical resource (wires in this case) is temporarily suspended while another context (request queuing) runs, after which the first context is restored and processing continues. This bus sequencing is illustrated in Figure 3-1 below.



Figure 3-1 Basic A.G.P. Pipeline Concept.

When the bus is in an idle condition, the pipe can be started by inserting one or more A.G.P. access requests consecutively. Once the data reply to those accesses starts, that stream can be broken (or intervened) by the bus master (e.g., graphics controller) to:

- Insert one or more additional A.G.P. access requests.
- Insert a PCI transaction.

This intervene is accomplished with the bus ownership signals, REQ# and GNT#.

Operation of the bus can also be understood in terms of the 4 bus states shown in Figure 3-2. The operation of the PCI bus can be described by the two states *PCI* and *IDLE*, and the transition lines directly connecting them.



Figure 3-2 A.G.P./PCI Operational States

July 31, 1996

The A.G.P. pipeline is initiated from the idle state by arbitrating for the bus, and delivering one or more A.G.P. access requests (A.G.P. state). These requests are transmitted much like a PCI address phase except that they are timed with **PIPE#** rather than **FRAME#**. When one or more addresses has been transmitted, and **PIPE#** is deasserted, the bus enters the DATA state, in which the core logic (sole A.G.P. compliant target) controls the **AD** lines, and transfers data. If a bus master then requests the bus (**REQ#**) the A.G.P. compliant arbiter (always located in A.G.P. compliant target / core logic) suspends pipelined data transfer, and, using the **GNT#** signals, allows the bus master to initiate a bus transaction, driving the bus to either the A.G.P. or the PCI state depending on whether the master asserts **PIPE#** or **FRAME#**. After this transaction is complete, the bus returns to the DATA state and resumes the pipelined transfer. Pipelined data flow may be suspended ONLY at transaction boundaries, never in the middle of a single transaction. While the return of data is pending (a request for data has not been completed) the state machine remains in the DATA state. If new request need to be enqueued while data is pending, the machine transitions from DATA to A.G.P. or PCI depending on what type of request is initiated. Only when all data has been transferred that was previously request, does the machine return to the IDLE condition. For mobile designs, the clock is not allowed to be stopped or changed except when the bus has returned to the Idle state, which means that there are no outstanding requests pending. Flow control issues and protocol for pipelined A.G.P. requests and data transfer are discussed in section 3.2.5

## 3.1.2 Addressing Modes and Bus Operations

A.G.P. transactions differ from PCI transactions in several important ways.

- 1. The data transfer in A.G.P. transactions (both reads and writes) is "disconnected" from its associated access request. I.e., request and associated data may be separated by other A.G.P. operations, whereas a PCI data phase is connected to its associated address phase with no possibility of intervening operations. This separation not only allows the pipe depth to be maintained, but also allows the core logic to insure a sufficiently large buffer is available for receiving the write data before tying up the bus on a data transfer that otherwise could be blocked awaiting buffer space. Note that all of the access ordering rules on A.G.P. are based on the arrival order of the access requests, and not the order of actual data transfer.
- 2. A.G.P. transactions use a completely different set of bus commands (defined below) than do PCI transactions. A.G.P. bus commands provide for access ONLY to main system memory. (PCI bus commands provide for access to multiple address spaces: memory, I/O, configuration.) The address space used by A.G.P. commands is the same 32-bit, linear physical space also used by PCI memory space commands,<sup>7</sup> as well as on the processor bus.
- 3. Memory addresses used in A.G.P. transactions are always aligned on 8-byte boundaries; 8 bytes is the minimum access size, and all accesses are integer multiples of 8 bytes in length.<sup>8</sup> (Memory accesses for PCI transactions have 4-byte granularity, aligned on 4-byte boundaries.) Smaller or odd size reads must be accomplished with PCI read transaction. Smaller or odd size writes may be accomplished via the **C/BE#** signals, which enable the actual writing of individual bytes within an eight byte field.

<sup>&</sup>lt;sup>7</sup> This physical memory space may contain a GART range, within which addresses are translated per the description in section 2.1.

<sup>&</sup>lt;sup>8</sup> The motivation for increasing the addressing granularity from 4 bytes (PCI) to 8 bytes is tied to the typical memory organization used with 64-bit processors. Memories for these systems will generally be 64 bits wide. Therefore, smaller accesses will not provide any performance savings at the memory, and the motivation of A.G.P. is centered around maximizing memory performance for media controllers.

- 4. A.G.P. access requests have an explicitly defined access length or size. (PCI transfer lengths are defined by the duration of **FRAME#**.)
- 5. A.G.P. accesses do not guarantee memory coherency. I.e., A.G.P. accesses are not required to be snooped in the processor cache. PCI memory accesses always insure a coherent view of memory, and must be used on accesses where coherency is required.

The format of a complete A.G.P. bus request is shown in Figure 3-3.



#### Figure 3-3 Layout of an A.G.P. Access Request

The '*LLL*' field contains the access length in units of Q-words (8 bytes), and displaces the low order 3 bits of address. A length field of "000" indicates that a single Q-word (8 bytes) of data is being requested, while "111" indicates 8 Q-words (64 bytes) are being requested. The 'CCCC' field contains the bus operation or command as itemized in Table 3-1.

| 0000 | Read                    |
|------|-------------------------|
| 0001 | Read (hi-priority)      |
| 0010 | reserved                |
| 0011 | reserved                |
| 0100 | Write                   |
| 0101 | Write (hi-priority)     |
| 0110 | reserved                |
| 0111 | reserved                |
| 1000 | Long Read               |
| 1001 | Long Read (hi-priority) |
| 1010 | Flush                   |
| 1011 | reserved                |
| 1100 | Fence                   |
| 1101 | reserved                |
| 1110 | reserved                |
| 1111 | reserved                |

#### CCCC A.G.P. Operation

 Table 3-1 A.G.P. Bus Commands

Command definitions are as follows.

*Read:* starting at the specified address, read *n* sequential Q-words, where  $n = (\text{length}_{\text{field}} + 1)$ .

*Read (hi-priority):* same as *Read*, but the request is queued in the high priority queue. The reply data is returned out of order and within the maximum latency window established for high priority accesses (seesection 2.3). High priority accesses only follow A.G.P. ordering rules with respect to other high priority read accesses.

- *Write:* starting at the specified address, write *n* sequential Q-words, as enabled by the **C/BE#** bits, where  $n = (\text{length}_{\text{field}} + 1)$ . Writes obey the bus ordering rules (they may be retired ahead of previously issued reads).
- *Write (hi-priority):* same as *Write*, but indicates that the write data must be transferred from the master within the maximum latency window established for high priority accesses<sup>9</sup> (see section 2.3). High priority write accesses only follow A.G.P. ordering rules with respect to other high write priority accesses.
- *Long Read:* same as *Read* except for access size; in this case  $n = 4 * (length_field + 1)$ , allowing for up to 256 byte transfers.
- Long Read (hi-priority): same as Read (hi-priority) except for access size which is the same as for Long Read.
- *Flush:* similar to read. This command drives all low priority write accesses ahead of it to the point that all the results are fully visible to all other system agents, and then returns a single Q-Word of random data as an indication of its completion (see section 3.2.3). The address and length fields are meaningless for this command.
- *Fence:* creates a boundary in a single master's access stream, around which writes may not pass reads (see section 3.2.3). This command is the only one which does NOT occupy a slot in the A.G.P. pipeline.

*Reserved:* Must not be issued by a master and may be defined in the future by Intel.

#### 3.1.3 Address Demultiplexing Option

Ideally, to maximize efficiency and throughput on a random access memory bus such as A.G.P., the address is demultiplexed (separate address pins) from the data pins. The A.G.P. interface provides an optional sideband signal set to do this (**SBA[7::0]**), referred to as the *sideband address port*. However, in order to keep pin cost down it is only an 8-bit wide interface. Implementation of the sideband address port is optional for an A.G.P. compliant master. However, an A.G.P. compliant target (all core logic products) is required to fully support sideband address port operation. This means that a master wishing to use this A.G.P. feature may always depend on target support. Software can query the master and determine if the SBA port will be used by the master or not. When used by the master, software will enable the target to accept requests using the SBA port. See section 6.1 for a description of the SBA status and enable bits.

The sideband address port is used exclusively to transmit A.G.P. access requests (all PCI transactions use the **AD** pins for both data and address), and therefore it is always driven in one direction; from master to target. The semantics of an A.G.P. request transmitted via **AD** pins or **SBA** pins are identical; only the actual syntax of transmission varies. The **SBA** and **AD** pins are never used in any combination to transmit requests; in any given configuration, all A.G.P. requests are transmitted either on **AD** pins or **SBA** pins. A master which uses the sideband address port, has no need of the **PIPE#** signal, which is used only to frame requests on the **AD** pins.

In order to transmit the complete A.G.P. access request across the 8-wire **SBA** port, the request is broken into 3 parts: low order address bits and length, mid-order address bits and command, and high order address bits. These three parts are referred as Type 1, Type 2 and Type 3 respectively. The registers for the last two parts (mid-order address bits (Type 2) and high order address bits (Type 3)) are 'sticky'. Where *sticky* 

<sup>&</sup>lt;sup>9</sup> This implies that if the target write queue is full, some access priority must be raised in order to accommodate this access within the latency requirement.

refers to the attribute where they retain what was last loaded into them, so these two parts of the request need only be transmitted if they have changed since the previous request. This exploits the potential locality in the access request stream to minimize the address traffic over the 8 **SBA** signals.

The transmission of each of these 3 Types is accomplished by a separate **SBA** operation. Each operation on the sideband address port delivers a total of 16 logical bits, in 2 phases or transfer ticks of 8 bits each. In 1x transfer mode, each **SBA** operation requires two A.G.P. clocks; while in the 2x transfer mode (source clocked option is active), the entire transfer completes in one A.G.P. clock (See 3.5.1.2.) The **SBA** pins always operate at the same transfer rate as the **AD** pins; either 1x or 2x transfer mode as initialized in the A.G.P. command register (see section 0.) This relationship keeps the 8-bit sideband address port well matched in speed with data transfer on the **AD** pins, since the minimum data transfer size is 8 bytes (2 **AD** ticks), and most A.G.P. access requests will only involve the low order address bits and length, requiring a single **SBA** operation (2 **SBA** ticks).

Table 3-2 below shows the definition and encoding of each of the sideband address port operations. Bold underlines in the encoding column indicate op codes. Each opcode requires two data transfers to move the entire 16 bits to the A.G.P. compliant target. Note that the first piece of data transferred includes the op code. For example for the Length and Lower Address Bits encoding which has an opcode is 0. In this encoding, the first data transferred is the op code (0) and address bits 14 - 08. The second piece of data transferred is address bits 7-3 and the 3 length encoded bits.

Table 3-2 Sideband Address Port Encoding

| Encoding                                             | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |  |  |  |  |
|------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--|--|--|--|
| S <sub>7</sub> S <sub>0</sub>                        | Shows alignment of messages on physical sideband wires.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |  |  |  |  |
| <u>1111</u> <u>1111</u><br>[1111 <u>1111]</u>        | <b>Bus Idle:</b> used to indicate the bus is idle, also referred to as a NOP. When running at 1x transfer mode, this command is limited to a single clock tick of 8 bits (all ones) while 2x transfer mode requires the full 16 bits as shown here.                                                                                                                                                                                                                                                                                                                 |  |  |  |  |
| <b>O</b> AAA AAAA<br>14 08<br>AAAA ALLL<br>07 03     | <i>Length &amp; Lower Address Bits:</i> the A.G.P. access length field (LLL), and lower 12 address bits (A[14::03]) are transferred across the sideband address port, and a memory access is initiated. The encoding is also referred to as a Type 1 sideband command. The remainder of the A.G.P. access request (A[31::15] and bus command) is defined by what was last transmitted using the other two sideband address port commands (Type 2 and Type 3. Note that AD[2::0] are assumed to be zero when using this encoding and these bits are not transferred. |  |  |  |  |
| <b><u>10</u></b> CC CC-A<br>15<br>AAAA AAAA<br>23 16 | <i>Command &amp; Mid Address Bits:</i> the A.G.P. bus command ( <b>CCCC</b> ) and mid-order 9 address bits ( <b>A[23::15]</b> ) are transferred across the sideband address port; no memory access is initiated. This encoding is also referred to as a Type 2 sideband command. This command, when followed by the previous command (Type 1) provides for memory access anywhere within a naturally aligned 16 MB 'page'.                                                                                                                                          |  |  |  |  |
| 110- AAAA<br>35 32<br>AAAA AAAA<br>31 24             | <i>Upper Address Bits:</i> the upper 12 address bits ( <b>A</b> [35::24]) are transferred across the sideband address port; no memory access is initiated. This encoding is also referred to as a Type 3 sideband command. This command, when followed by the two previous commands (Type 2 and Type 1) provides for memory access anywhere within a 32-bit physical address space. The extra four bits ( <b>A</b> [35::32]) are place holders to avoid aliasing problems in the face of possible address expansion.                                                |  |  |  |  |
| <u>1110</u> ****<br>**** ****                        | <i>reserved:</i> must not be issued by an A.G.P. compliant master and maybe defined by Intel in the future.                                                                                                                                                                                                                                                                                                                                                                                                                                                         |  |  |  |  |

Note that only one sideband address port command (Type 1) actually initiates a memory cycle; the other two (Type 2 and Type 3) simply update respective 'sticky' bits. There is no restriction on the relative ordering in which Type 1, 2 or 3 commands can be issued by the master. If a memory cycle is initiated prior to the initial setting of all access request 'sticky' bits, those bits will be treated as indeterminate. For example, if the first command issued after the port is enabled is a Type 1 command (Type 2 or Type 3 have not occurred yet), the A.G.P. compliant target may use an indeterminate address bits (A15- A31) and command (C3-C0) to access memory. The master is allowed to issue Type 1, 2 or 3 commands in any order and memory accesses are queued anytime a Type 1 is issued. The A.G.P. compliant target receives a Type 1 command it takes the Type 1 information and combines it with previously stored Type 2 and Type 3 information to reconstruct a full address, command and length information to initiate a memory access.

The sideband address port has no associated control or framing signals; command framing is content sensitive (similar to serial interconnects). That is, the port encoding signifies whether there is valid information on the port. A NOP encoding (all 1's) indicates the port is idle and no action is initiated by the master. NOP must be continually transmitted when the port is not in use. The Type 2 and 3 target registers

are not affected while NOPs appear on the SBA interface. Since all **SBA** operations must start with the rising edge of the A.G.P. clock, the port idle encoding is 8-bits long in 1x transfer mode, and 16-bits long in 2x transfer mode.

## 3.2 Access Ordering Rules & Flow Control

#### 3.2.1 Ordering Rules and Implications

This section discusses the ordering relationships between A.G.P. and non-A.G.P. transactions initiated on the A.G.P. interface, between different streams of A.G.P. transactions, between A.G.P. transactions and other system operations (CPU and PCI). These rules apply to operations generated by an A.G.P. compliant Master (and completed by a A.G.P. compliant target). Note that the following rules do not apply to High Priority operations, which will be discussed in section 3.2.4.

#### A.G.P. Compliant System Ordering Rule:

There is no ordering relationship between an A.G.P. Compliant Master's operation and any other system operation, including operations generated by host CPU(s), PCI agents, or expansion bus agents.

This means that A.G.P. transactions are only required to follow A.G.P. ordering rules, even when A.G.P. transactions cross into other domains. For example, an A.G.P. compliant master read to a location in memory that is currently locked by the processor is not required to adhere to the processors lock. It is allowed to complete in the face of the lock and a programming error has occurred if this causes an incorrect operation. A.G.P. compliant hardware is not required to ensure consistent data when A.G.P. transactions interact with the rest of the system. For a read this means that the A.G.P. compliant hardware is allowed to get a copy from main memory and is not required to obtain a more recent copy (if available) from the CPU cache. For a write, the A.G.P. compliant hardware can simply write the data to main memory without snooping the CPU cache. If the cache has a modified line, it will over write the A.G.P. domain to memory. When an A.G.P. compliant master needs to cause a synchronization event (request an interrupt or set a flag) to occur it must use the FLUSH command to guarantee that previous A.G.P. write operations become visible to the rest of the system. See section 3.2.3 for details.

#### A.G.P. Compliant Device Ordering Rules

- 1) The is no ordering relationship between an A.G.P. operation and a PCI transaction.
- 2) The A.G.P. compliant Target will return a stream of A.G.P. Read data in the same order as requested.

Example: Reads requested in the order A, B, C, D will return data in the same order as requested A, B, C, D.

3) A.G.P. Write operations are processed by the A.G.P. compliant Target in the order they are requested.

Example: Writes requested in the order A, B where A and B overlap will cause B to overwrite part of A.

4) Read data returned will be coherent with previously issued A.G.P. Write requests.

Example: Requests in the order 'Wa, Wb, Rc, Wd, Re' to the same address. Read data returned for Re will be what was written by Wd. (Reads push writes.)

5) An A.G.P. Write operation may bypass previously issued A.G.P. Read operations. Read data returned may reflect data associated with a subsequently issued Write request. (Writes are allowed to pass reads)

Example: Requests in the order 'Wa, Wb, Rc, Wd, Re' to the same address. Read data returned for Rc may be either what was written by Wb or Wd. Wd is returned when Wd passes Rc.

6) PCI transactions initiated by an A.G.P. compliant Master or Target must follow the ordering rules specified in the PCI specification.

## Explanation of A.G.P. interface Ordering Rules (explanations correspond to rules with the same number):

- 1) When an A.G.P. compliant agent is capable of generating both PCI and A.G.P. transactions, the A.G.P. compliant target is not required to maintain any ordering between these two streams. However, the A.G.P. compliant target is required to maintain order within a given stream based on the ordering rules for that stream. For example, a master issues a PCI and an A.G.P. transaction. The order in which the A.G.P. and PCI transactions complete does not matter. These are two different streams of requests and different streams have not ordering relationships.
- 2) Even though the A.G.P. compliant Target will return a stream of A.G.P. Read data in the same order as requested, this does not mean that the read transactions actually occur at the destination in the same order as requested. For example, a master enqueues Read x and then Read y. Before the read data is returned, a Write to location x and then a Write to location y occurs. Because of the ordering rules defined above, it is possible for the Read to location x to return old or new data and the Read to location y to return old or new data. Note that if the Read to location x returns new data it does not imply that the Read to location y will also return new data. If the Read to location x returned old data, it is possible for the Read to location y to return new data. The value that is returned is determined by the A.G.P. compliant target after the requests have been enqueued and before data is returned to the master. The ordering rules as described above only require that the data being returned to the master be delivered in the same order as requested. The A.G.P. compliant target is allowed to re-arrange Read requests to improve performance, but is never allowed to return data in a different order than requested. Another example is where the master requested Read A, B, C and then D. However, the memory controller is allowed to obtain the data C, B D and then A, but is required to return the data in A, B C and then D.
- 3) This rule means that A.G.P. write data cannot pass previously written A.G.P. data.
- 4) Read requests will push Write data (within the same stream) from an A.G.P. compliant master. An A.G.P. read to a location previously written by an A.G.P. write operation will return the latest copy of the data seen by the A.G.P. interface.
- 5) This rule means that an A.G.P. write issued after an A.G.P. read is allowed to pass the read request and may cause the read to return the new value of the data even though the write request and data transfer occurred on the bus after the read request occurred. To ensure that the old value is returned, the A.G.P. compliant master must not issue the write transaction until after the read data has returned from the read request or issue a FENCE command (which is discussed in section 3.2.3) between the read and the write.

#### Implications of Allowing A.G.P. Writes to Pass A.G.P. Reads

A potential problem created by allowing A.G.P. Writes to pass A.G.P. Reads is that an A.G.P. Read may return "old" data from a previous A.G.P. Write or "new" data from a following A.G.P. write. An A.G.P. Read sandwiched by A.G.P. Writes may return data for either write — it is indeterminate. This is shown in the following example:

A 3D Graphics controller Master generates the following sequence of pipelined A.G.P. requests. In this example the reads are from the frame buffer, texture buffer, and depth buffer respectively, while the writes are to the frame buffer. This example assumes that all frame buffer accesses are to the same address.

| Request Order: | 1    | 2   | 3   | 4   | 5   | 6   | 7   | 8   | 9   |
|----------------|------|-----|-----|-----|-----|-----|-----|-----|-----|
|                | W Oc | R2c | R2t | R2z | W1c | R3c | R3t | R3z | W2c |

The following diagram shows W1c passing R2t. In this case R2c will return "old" data from the W0c write.

| Request Order: | 1    | 2   | 3   | 4   | 5   | 6   | 7   | 8   | 9   |
|----------------|------|-----|-----|-----|-----|-----|-----|-----|-----|
|                | W Oc | R2c | R2t | R2z | W1c | R3c | R3t | R3z | W2c |
|                |      | X   |     |     |     |     |     |     |     |

The following diagram shows W1c passing R2c. In this case R2c will return "new" data from the W1c write.



The following diagram shows both W1c and W2c passing R2c. In this case R2c will return "new" data from the W2c write. (In this graphics controller example write W2c is dependent on R2c data returning. So in reality the write request W2c will not be generated before the read data for R2c is returned. However if the requests were pipelined deeper it would be possible for several writes to pass a particular read.)



7) PCI transactions generated by a device on the A.G.P. interface follow the same rules as a device that resides on a PCI bus segment. The PCI agent transactions will follow the same rules as described in the PCI bus specification even though it was initiated on the A.G.P. interface. This agent's transactions have no ordering with respect to any A.G.P. transactions that occur.

#### A.G.P. Compliant Master Implications of Allowing Writes to Pass Reads

If an A.G.P. compliant Master doesn't care if it gets "old" or "new" data for a given read operation no special action needs to be taken. If an A.G.P. compliant master is particular about getting "new" or "old" data it is the A.G.P. Compliant Master's responsibility to ensure that it gets the correct data. There are various methods to ensure this. Some of these methods are discussed below.

If an A.G.P. Compliant Master must get "new" data it may:

Detect that a conflict exists between a Read request that has already been generated and an internally pending Write request, merge (or substitute) the "new" Write data with the "old" Read data when it is returned; or

Delay the Read request behind the Write request. Since Reads "push" Writes per the ordering rules, the Read will return the "new" data. Since it is desirable to deeply pipeline the A.G.P. requests, actually determining that a conflict exists between a Read and a subsequent Write may be difficult (or impossible) to detect. Once a conflict is detected delaying the Read may stall the pipeline and impact performance.

If an A.G.P. Compliant Master must get "old" data it may:

Issue a FENCE command between the Read and the following Write or

Delay the "new" data Write until the "old" Read data has been returned. This method has the potential for deadlock. A deadlock occurs when delaying a Write causes an A.G.P. Compliant Master's data engine to back-up. If the A.G.P. Compliant Master's Read data return buffers are full the stalled data engine can't remove Read data from the buffers. The "old" Read data can't be accepted and the Write will continue to be delayed, creating the deadlock.

#### 3.2.2 Deadlock Avoidance

An A.G.P. Compliant Master can not make the data transfer phase of a previously issued request dependent on the completion of *any* other A.G.P. or PCI transaction to the device as either a Master or Target.

#### 3.2.3 Flush and Fence Commands

Because of the ordering rules of the A.G.P. interface, the master can not guarantee when write data has reached its final destination. From the A.G.P. compliant master's standpoint a write transaction appears to have completed but the write data may still be pending in the A.G.P. interface. The master also needs the ability to ensure that certain transactions complete on the A.G.P. interface before other transaction are issued. This can be accomplished by delaying the issuing of subsequent requests until previous requests complete, but this defeats the use of pipelining to improve system performance. The FLUSH command causes A.G.P. transactions to become visible to the rest of the system so synchronization events may occur. The FENCE command guarantees what order accesses will complete in, without delaying the issuing of subsequent commands. Each will be discussed in more detail in the following paragraphs. The FENCE and FLUSH commands are low priority commands and have no affect on high priority requests.

#### Flush:

Under most conditions the master does not care if its transactions are visible to the system or not. But in those cases when it does matter, the FLUSH command is used by an A.G.P. compliant master. The FLUSH command ensures that its low and high priority write transactions have become visible to the rest of the system. Because of the A.G.P. ordering rules, the master can not cause accesses to become visible to the system by using the memory commands like is possible when using PCI commands. Memory commands can only cause data to be returned in a specific order; they place no requirement on the core logic to make

accesses visible to the system. However, the core logic must cause A.G.P. accesses to become visible to the rest of the system when the FLUSH command is issued. The FLUSH command behaves similar to a low priority read command except that a single Qword of random data is returned. The return of the random data is the acknowledgment to the master that all previous low and high priority write transactions have become visible to the system. When the FLUSH command completes, the master may safely cause a synchronization event to occur.

Take the case when the A.G.P. compliant master writes data to memory, but does not use the FLUSH command before generating an interrupt. The driver reads its device and determines that data is valid in memory. When it accesses main memory (from the CPU) it may access stale data because the data is still in the A.G.P. domain. In the PCI domain, this sequence was all that was required to guarantee that the correct data would be accessed. However, for A.G.P. this is not sufficient. Since A.G.P. accesses have no ordering with respect to any other accesses in the system (in this example from the CPU), the A.G.P. interface is not required to flush posted write data before completing a read to the A.G.P. interface. Therefore, the posted write data may still reside in the A.G.P. interface and the driver may access stale data. For PCI transactions, the flush of posted data on any read causes loss of performance in the system and generally is only required in certain cases. The FLUSH command provides a mechanism for the master to ensure that the correct data will be accessed when a synchronization event occurs, but does not force the system to flush buffers when not required.

The FLUSH command occupies a slot in the Transaction Request Queue when issued and is retired from the queue when the associated single Qword of data is returned. The only limit to the number of outstanding FLUSH requests are the limits of the transaction request queue itself. It is possible to have the Transaction Request Queue full of FLUSH commands.

#### Fence:

Because of the A.G.P. ordering rules, the master needs a mechanism that forces writes not to pass previously enqueued read commands. An A.G.P. compliant master uses the FENCE command to demarcate one set of A.G.P. requests from another. The FENCE command affects the order in which they are completed in memory and may not necessarily determine the order in which they complete on the bus. On either side of the demarcation, A.G.P. requests are processed based on the A.G.P. ordering rules. However, ALL requests generated prior to the FENCE command are processed prior to ANY request following the FENCE command. A.G.P. Write requests generated after a FENCE command may not pass any Read requests generated prior to the FENCE command. Read requests issued after the FENCE command may not be combined with or pass any Read request issued prior to the FENCE command.

High Priority requests are exceptions and are allowed to pass the demarcation established by the FENCE command. The FENCE command does not occupy a slot in the Request queue of the A.G.P. compliant master or target. An A.G.P. compliant master may generate an unlimited number of FENCE commands.

## 3.2.4 Access Request Priority

The A.G.P. bus command set supports two levels of access priority. In general, the high priority queue has the highest priority for memory service, and the low priority queue has lower priority than the processor, but generally higher than any other subsystem for memory service. The high priority queue should be used with caution since it causes additional latency to other requests. For example, the high priority queue may be useful for a graphics controller reading display memory or to avoid overflow/underflow in a data stream having real-time deadlines. The high priority queue is intended for very selective use when an A.G.P. request needs immediate processing.

Requests in the high priority queue may bypass all other (low priority or PCI) requests and may be returned out of order with respect to other streams. Only requests that can tolerate re-ordering (with respect to all accesses other than themselves) should be completed using a high priority command. High priority accesses only have order with respect to the same type of request. For example, high priority read requests only have ordering with respect to other high priority read requests. High priority write accesses only have ordering with respect to other high priority write accesses. Unlike low priority operations, there are no ordering requirements between high priority read and high priority write accesses. The sequence, HPR-A, HPW-B, HPR-C and HPW-D will be used in the following discuss. Read data will be returned in the order in which read accesses were requested. In this example, A will always complete before by C. Write data will always complete in the order requested, in this example write B will always complete before write D. There is no order between read and write high priority operations. In this example, the accesses may complete A, C, B and D; A, B, C and D; B, A, D and C; or B, D, A, and C. However, the order can NEVER be C completes before A or D completes before B.

Both Read and Write requests may be issued as high priority accesses.. The A.G.P. protocol designates read replies as part of either the high or low priority stream, enabling the bus master which originated the access to associate the reply with the correct outstanding request. Writes issued as high priority accesses will have transferred the data across the interface within the maximum latency window established for high priority accesses. This does not imply that the data will have been retired to main memory within this latency window.

### 3.2.5 Flow Control

### 3.2.5.1 Introduction

Flow control on A.G.P. is different than that of PCI. On PCI, the master and target may delay the transfer of data on any data phase. Before each data phase can complete both the master and target must agree that data can be transferred by asserting their respective **xRDY#** signal. When either is not prepared to transfer data, the current data phase is held in wait states. PCI also allows the target to indicate to the master that it is not capable of completing the request at this time (Retry or Disconnect). Only when both agents agree to transfer data does data actually transfer.

On A.G.P., flow control is over blocks of data and not individual data phases. Flow control will be discussed with respect to Initial Blocks and Subsequent Blocks. Some transactions only have initial blocks, this occurs when the entire transaction can be completed within 4 clocks. For transactions that require more than 4 clocks to complete, they are comprised of both an Initial Block and one or more Subsequent Blocks. A block is defined as four A.G.P. clocks and is 8 byte aligned, but is not required to be cacheline aligned. Depending on the transfer mode, the amount of data that is actually transferred may change. But in all cases, the number of clocks between throttle points is always four. Flow control on A.G.P. refers to the initial or subsequent data block. Table 3-3 lists the control signals and which agent drives them and which agent receives them. The table following the diagram lists the flow control of initial and subsequent blocks based on transaction type and which agent is allowed to flow control the data movement.

| Flow Control Point ><br>Transaction Type | Target - Initial<br>Data Block | Target -<br>Subsequent<br>Data Block | Master - Initial<br>Data Block | Master -<br>Subsequent<br>Data Block |
|------------------------------------------|--------------------------------|--------------------------------------|--------------------------------|--------------------------------------|
| Low Priority Read<br>Data                | TRDY#                          | TRDY#                                | RBF#                           | IRDY#                                |
| High Priority Read<br>Data               | TRDY#                          | TRDY#                                | None                           | IRDY#                                |
| Write Data                               | GNT#/ST[2::0]                  | TRDY#                                | IRDY#                          | None                                 |

| Table 3- | <b>3</b> A.G.P | . Flow | Control | Points    |
|----------|----------------|--------|---------|-----------|
|          |                |        | 001101  | 1 0 11100 |

There are basically three times in which the transfer of data can be delayed:

- 1. After the data has been requested but before initial data block is returned;
- 2. During the initial data block of the transaction; and
- 3. for Subsequent data block(s).

The first case occurs when the master is allowed to delay the return of read data for a Low Priority (LP) transaction by the use of **RBF#** (Read Buffer Full). This signal implies that the Read Buffer is currently full and that the arbiter is not allowed to attempt to return LP read data.

The second case occurs for both the master and target. The target is allowed to delay the completion of the first data phase of the initial block of read data by not asserting **TRDY#**. Since the control and data are moving in the same direction, the master latches the read data and qualifies it with **TRDY#**. When **TRDY#** is asserted, the data is valid on each clock until the block completes. The master is not allowed to flow control the initial block of a High Priority (HP) Read transaction. However, like LP reads, the master is allowed to flow control HP reads like LP reads on Subsequent blocks of data. The master is allowed to delay the first data phase of a write transaction by not asserting **IRDY#**. The agent receiving data is not allowed to insert wait states on the initial block of data for either read or write transactions. The agent sending data is allowed to delay the initial block of data of either a read or write up to a maximum of one clock from when the transaction was allowed to start.

The third or last case, is where both the master and target are allowed to insert waitstates on subsequent blocks of read data. The master is not allowed to insert waitstates on subsequent write transactions, but is allowed to on read transactions (both High and Low Priority).

For the throttle point (TP), there is no specified limit to how long **IRDY#** or **TRDY#** may be deasserted. However, the master must realize that inserting even one wait state at any TP of a read transaction may invalidate the latency guarantee of **all** outstanding high priority requests. If the master inserts a wait state at a TP it can not make any assumptions about what impact the wait state will have on the latency guarantee. For instance, inserting 5 wait states at a TP of Read A (high or low priority) does not mean that outstanding high priority read request B will complete in x + 5 clocks (where x is the latency guarantee provided by the core logic). The target must include any potential **TRDY#** throttle point wait states in its latency guarantee. The specific latency behavior of a target when a master inserts a waitstate is implementation specific. Refer to the data sheet of the specific device to understand this affect.

#### 3.2.5.2 Read Flow Control

#### Initial Master Flow Control (Low Priority Reads)

**RBF#** (Read Buffer Full) is an output of the Master and indicates whether it can accept low priority read data or not. What affect the assertion of **RBF#** has on the data transfers depends on the length of the next transaction and the rate at which data is being transferred. If the master has **RBF#** deasserted it must be able to accept the following transactions assuming that the master asserts **RBF#** on the clock in which the grant is received:

- For transactions that can be completed in 4 clocks or less, the master is required to accept the entire transaction without wait states regardless of the data transfer mode. When the transaction requires more than 4 clocks to complete, the master is allowed to insert waitstates after each 4 clocks in which data is transferred.
- For 1x data transfers, the master must accept the entire transaction without wait states when the length is less than or equal to 16 bytes. When the transfer length is greater than 16 bytes, the master is allowed to flow control after each 16 byte transfer. When the length is 8 bytes or larger, the master has sufficient time to assert **RBF#** to prevent the arbiter from initiating the return of more LP read data.

- For 2x data transfers, if a low priority read transaction's length is *greater* than 8 bytes the master must accept only the one low priority read transaction, because the master has sufficient time to assert **RBF#** to prevent the arbiter from initiating the return of more read data. When the transfer size is greater than 32 bytes, the master is allowed to flow control after the transfer of each 32 byte block.
- For 2x data transfers, if the first low priority read transaction's length is *equal* to 8 bytes the master must accept two low priority read transactions. The first transaction must be accepted without flow control. The master must also accept the entire second transaction without flow control when its length is less than or equal to 32 bytes. When the second transaction's length is greater than 32 bytes, the master must accept the initial 32 bytes of the transaction, but is then allowed to flow control the subsequent 32 byte block(s).

Note: The arbiter must delay the assertion of **GNT#** for a subsequent read data transfer so that it is sampled asserted on the same clock edge as the last data phase for the previous read transaction when it is greater than 8 bytes. In order to allow full performance of 8 byte read transfers, the arbiter must pipeline the assertion of **GNT#** in a back-to-back fashion, otherwise dead clocks will appear on the **AD** bus. If the arbiter did not delay the subsequent **GNT#** in this manner the master would need a minimum of 64 bytes of buffering instead of 40 bytes for the 2x transfer mode, for example.

Table 3-4 shows the minimum amount of available data buffering required in the master when **RBF#** is deasserted. The table only applies to 2x data transfer mode<sup>10</sup>.

| 1st Read<br>Transaction | 2nd Read<br>Transaction | Buffer space needed to de-assert <b>RBF#</b> |
|-------------------------|-------------------------|----------------------------------------------|
| 8 bytes                 | $8 \le n \le 32$ bytes  | 8 + n bytes                                  |
| 8 bytes                 | n > 32 bytes            | 40 bytes                                     |
| 16 bytes                | don't care              | 16 bytes                                     |
| 24 bytes                | don't care              | 24 bytes                                     |
| 32 bytes                | don't care              | 32 bytes                                     |
| > 32 bytes              | don't care              | 32 bytes                                     |

**Table 3-4** Data Buffering for 2x Transfers

If the master can't accept the above transaction(s) it asserts **RBF#**. The A.G.P. compliant arbiter will not assert subsequent grants for low priority read data while **RBF#** is sampled asserted. In the event that **GNT#** and **RBF#** are asserted on the same clock, the master must be able to accept at least 4 clocks worth of data, the amount of data is dependent on the transfer mode. For the 2x mode, at least 32 bytes of data for the next low priority read transaction must be accepted.

Note: The A.G.P. compliant master has many implementation alternatives that can be predicated by buffer budget and complexity. For example the A.G.P. compliant master could restrict itself to generating only 16 byte low priority read transactions. In this case only 16 bytes of buffering need to be available in order to deassert **RBF#**. If an A.G.P. compliant master restricts itself to 8 and 16 byte low priority read

<sup>&</sup>lt;sup>10</sup> For 1x data transfer mode, the amount of data buffering required is simply enough to accept the next data transfer or up to 16 bytes, whichever is greater.

transactions, **RBF#** can be deasserted whenever 24 bytes of buffering are available when in 2x transfer mode. An A.G.P. compliant master that does not restrict the size of its low priority read requests needs a minimum of 40 bytes of buffering for 2x transfer mode. Optionally this master could dynamically alter the **RBF#** threshold point based on the size of the next two accesses. It is highly recommended that the master use **RBF#** only in unusual circumstances in which the target is able to provide data quicker than the master is able to consume it. In normal operations, the master should be able to consume that requested data faster than the target is able to provide it. The assertion of **RBF#** to stop the return of data should not be part of the "normal" behavior of the master. Figure 3-4 illustrates the enqueuing of two grants before the arbiter detects that **RBF#** is asserted.



Figure 3-4 Maximum number of GNT#s Queued before the assertion of RBF#

Since **RBF#** is deasserted on clock 1, the A.G.P. compliant arbiter asserts **GNT#/ST[2::0]** to indicate the return of a low priority read data (D0) for clock 2. Since the first read data being returned is 8 bytes, the A.G.P. compliant arbiter continues asserting **GNT#/ST[2::0]** for clock 3 to allow the second transfer to occur without dead time on the **AD** bus between data transfers of D0 and D1. At this point the arbiter has indicated to the master that two transactions worth of data will be returned. Because **RBF#** is asserted on clock 3, the arbiter is not allowed to initiate the return of any more data after this point until **RBF#** is sampled deasserted again. The arbiter asserts **GNT#** for clock 6, since the master deasserted **RBF#** on clock 5 and the arbiter is ready to return more low priority read data to the master.

The master decodes the initial request (clock 2), determines that sufficient buffer space is not available for a subsequent transaction and asserts **RBF#**. Since **GNT#** and **RBF#** are both asserted on clock 3, the master must accept the second transaction. While the master keeps **RBF#** asserted, the arbiter is not allowed to initiate the return of any new low priority read data. However, the arbiter is allowed to return high priority read data, request (high or low priority) write data from the master, or grant the master permission to initiate requests. (See Figure 3-20.)

Since **GNT#** is asserted on clock 2 (**ST[2::0]** indicates the return of low priority read data), the master starts accepting data and qualifies it with **TRDY#** to determine when it is valid. Note: **TRDY#** is only asserted on the initial data transfer of this transaction since it will complete within four clocks. Once the initial data transfer completes, the master begins accepting data for the second transaction and qualifies that data with **TRDY#**. Note that **TRDY#** must be asserted on the initial data phase of each transaction.

**Initial Master Flow Control (High Priority Reads)** 

The master must always be able to accept read data for all high priority queued transactions that can complete within 4 clocks. When a high priority read request requires more than 4 clocks (multiple blocks) to complete, the master can throttle the transaction (and effectively stall subsequent high priority read data) with **IRDY#** after each data block transfers(this is discussed in the next section). **RBF#** does not apply to high priority read data and **IRDY#** can not be used to initially stall the return of high priority read data.

#### Throttling

Throttling applies uniformly to both low and high priority read data. Both the target and the master have the ability to throttle read data by adding wait states after each block of data transfers. If either the target or the master wants to throttle the transfer of a subsequent block of data, the target must have **TRDY#** or the master must have **IRDY#** deasserted two 1x clocks prior to when the subsequent block would begin to transfer. This is referred to as the throttle point (TP). Data transfer will resume two 1x clocks after both **IRDY#** and **TRDY#** are sampled asserted. If throttling is not required by either the master or the target, then both **IRDY#** and **TRDY#** will be asserted at the throttle point. A throttle point occurs every 4 clocks. **IRDY#** and **TRDY#** have no meaning between throttle points and may be deasserted. **IRDY#** and **TRDY#** also have no meaning on the last throttle point of a transaction that is equal to or less than a block. Note: that **IRDY#** for the master and **TRDY#** for the target must be actively<sup>11</sup> driven from the first TP until the completion of the last TP. Following the last TP, **xRDY#** must be deasserted and tri-stated. During a TP for a read data transfer, once **xRDY#** is asserted it must remain asserted until the current TP completes, which occurs when both **IRDY#** and **TRDY#** are asserted.

### 3.2.5.3 Write Data Flow Control

#### **Initial Target Flow Control**

The A.G.P. compliant arbiter will only assert **GNT#/ST[2::0]** for write data when the target can accept the entire transaction or the initial block. The initial flow control is the same for both high and low priority data write requests.

#### **Initial Master Flow Control**

The master samples **GNT#/ST[2::0]** asserted for write data and asserts **IRDY#** to begin the write data transfer. The master can delay the beginning of the write data transfer one clock by delaying the assertion of **IRDY#**. Figure 3-5 illustrates the maximum delay in which a master can take when providing write data. **IRDY#** must either be asserted on clock 3 (earliest data can be provided) or clock 4 (the latest in which data can be provided). Once the master asserts **IRDY#** it must transfer all write data associated with the transaction without wait states. Since the master is not allowed to insert waitstates on subsequent TP, **IRDY#** must be deasserted and tri-stated after it is asserted to start the data transfer. On read transactions, **IRDY#** is meaningless except during TPs and must be actively driven until the last TP completes, at which time **IRDY#** must be deasserted and tri-stated.

<sup>&</sup>lt;sup>11</sup> The **xRDY#** signal can be actively driven earlier when the transaction will not complete during the initial block. **xRDY#** is allowed to be deasserted and tri-stated between TPs although the timing diagrams do not illustrate this behavior.



Figure 3-5 Maximum Delay by Master on Write Data

#### Throttling

Since the master is aware of the quantity of data it wants to send and can generate smaller write request if necessary, the master is not allowed to throttle write data. Write data is only allowed to be throttled by the target. The target is only allowed to throttle after each block is transferred. When the target wants to throttle the transfer of a subsequent block of write data, it must have **TRDY#** deasserted at the throttle point which occurs two 1x clocks prior to when the subsequent block would begin to transfer. The transfer of the subsequent block of data will resume two 1x clocks after **TRDY#** is sampled asserted. If throttling is not required by the target, **TRDY#** will be asserted at the throttle point. **TRDY#** is meaningless (it may be asserted, deasserted or tri-stated<sup>12</sup>) between throttle points but must be actively driven during a TP. When the last TP completes **TRDY#** must be deasserted and tri-stated. **TRDY#** also has no meaning on the last throttle point of a transaction that less than or a multiple of a block. For example, if less than 4 clocks are required to complete the transaction completes before 10, the subsequent TP which would occur on clock 8, is not required and therefore does not exist. In this figure, the TP on clock 4 is the last and **TRDY#** must be deasserted on clock 6.

<sup>&</sup>lt;sup>12</sup> When tri-stated, it must have been driven deasserted before being tri-stated since this is a sustained tristate signal.



Figure 3-6 Write Data with One TP.

## 3.2.5.4 1 and 2 Clock Rule for IRDY# and TRDY#

For initial write data, **IRDY#** must be asserted by the master one or two clock edges after **GNT#** is sampled asserted when the **AD** bus is free and is one clock in duration. In the case where **GNT#** is pipelined to the master, **IRDY#** must be asserted on the first or second clock edge when the **AD** bus becomes free to complete the data transfer.

For initial read data, **TRDY#** must be asserted by the target one or two clocks after **GNT#** is sampled asserted when the **AD** bus is free and is one clock in duration. The target can not assert **TRDY#** on the same clock that it asserts **GNT#**. In the case where **GNT#** is pipelined, the one or two clock rule starts from the earliest time that **TRDY#** could be asserted.

## 3.2.5.5 Other Flow Control Rules

The agent receiving data should assert its flow control signal independent of the sender's flow control. For example, for low priority read data, the master must assert **RBF#** for the initial data block transfer and **IRDY#** for subsequent block data transfers independently of the assertion of **TRDY#**. On transfers of subsequent blocks of read data (where both **IRDY#** and **TRDY#** need to be asserted to continue), once **xRDY#** is asserted in a TP, it must remain asserted until both **IRDY#** and **TRDY#** are asserted on the same clock, which then completes the TP. In other words, once an agent has indicated that it is ready to continue the transfer it can not change its mind. Outside of the TP, the state of **xRDY#** is meaningless.

## 3.2.6 Source Throttling Address Flow Control

Address flow control for A.G.P. request is source controlled. This means that the A.G.P. compliant master is responsible for not enqueuing more requests than the target is capable of handling. System software reads the RQ field in the A.G.P. compliant target status register (see section 0) to learn the maximum number of requests that the target is capable of supporting. Software can also learn the maximum number of request slots supported by the master by reading RQ field in the A.G.P. compliant master is capable of supporting. Software then writes the master's  $RQ_DEPTH$  register in the command register with the value of the number of requests that the master can have outstanding. When the value is more than the master requested,

the master limits the number of outstanding requests by design. When the value is less than the master requested, the master is not allowed to enqueue more requests than the maximum value programmed. This guarantees that the A.G.P. compliant target's request queue will never overflow.

The A.G.P. compliant master must track the number of outstanding requests it has issued. A slot in the master's request queue is considered "used" whenever a read, write, or FLUSH command is issued to the target. The request queue slot becomes available again for another request when data associated with that request starts to transfers across the bus. Since a FLUSH command is treated like a LP read, it consumes a slot until the dummy read data is returned. When the number of outstanding requests reaches the allocated limit the master is not allowed to generate further Read, Write, or FLUSH requests until a slot is freed.

## 3.3 Pin Description

There are 16 new A.G.P. interface signals defined. The A.G.P. compliant target is required to support all 16 signals. The number of A.G.P. signals required by the master determines the performance of the master. There are basically two independent choices the master makes to determine its performance level:

- 1. How requests are transferred to the core logic; and
- 2. At what rate will data transfer.

All A.G.P. compliant devices are required to support 4 new signals (**ST[2::0]** and **RBF#**). The following table lists which other A.G.P. signals are required by the master based on A.G.P. functionality supported. The core logic is required to support all 16 signals.

| Table 3-5 A.G.P. Signals Required |                                            |                                                           |  |
|-----------------------------------|--------------------------------------------|-----------------------------------------------------------|--|
|                                   | Address Queuing on <b>AD</b> Bus           | Address Queuing on SBA Port                               |  |
| 1x Data Transfer Rate             | ST[2::0], RBF#, PIPE#                      | ST[2::0], RBF#, SBA[7::0]                                 |  |
| 2x Data Transfer Rate             | ST[2::0], RBF#, PIPE#,<br>AD_STB0, AD_STB1 | ST[2::0], RBF#, SBA[7::0],<br>AD_STB0, AD_STB1,<br>SB_STB |  |

In addition to the new A.G.P. signals, all A.G.P. enabled components must have the PCI pin complement on the A.G.P. port as described in Table 3-10. The A.G.P. signals follow the signal type definitions and naming convention used in the PCI Specification.

The following signal type definitions are from the view point of the A.G.P. compliant target.

| in    | <i>Input</i> is a input-only signal.                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
|-------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| out   | Totem Pole Output is an active driver.                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| t/s   | Tri-State is a bi-directional, tri-state input/output pin.                                                                                                                                                                                                                                                                                                                                                                                                                            |
| s/t/s | <i>Sustained Tri-State</i> is an active low tri-state signal owned and driven by one and only one agent at a time. The agent that drives a s/t/s pin low must drive it high for at least one clock before letting it float. A new agent cannot start driving a s/t/s signal any sooner than one clock after the previous agent tri-states it. A pull-up is required to sustain the inactive state until another agent drives it, and must be provided by the A.G.P. compliant target. |

The following table lists the signal names in the first column, signal types in the second column and the signal descriptions in the third column. In the second column, the direction of a t/s or s/t/s signal is from the view point of the core logic and is represented in the parentheses. For example, **PIPE#** is a sustained tri-state signal (s/t/s) that is always an input for the core logic. The tables below describe their operation and use, and are organized in four groups: Addressing, Flow Control, Status and Clocking.

|           | Table 3-6 A.G.P. Addressing |                                                                                                                                                                                                                                                                                                                                                                                                                                                             |  |  |
|-----------|-----------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--|--|
| Name      | Туре                        | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                 |  |  |
| PIPE#     | s/t/s<br>(in)               | <ul> <li><i>Pipelined</i> request is asserted by the current master to indicate a full width request is to be enqueued by the target. The master enqueues one request each rising edge of CLK while PIPE# is asserted. When PIPE# is deasserted no new requests are enqueued across the AD bus.</li> <li>PIPE# is a sustained tri-state signal from a <i>master (graphics controller)</i> and is an input to the target <i>(the core logic).</i></li> </ul> |  |  |
| SBA[7::0] | in                          | <i>Sideband Address port</i> provides an additional bus to pass address and command to the target from the master. <b>SBA[7::0]</b> are outputs from a master and an input to the target. This port is ignored by the target until enabled (see section 6.1.9).                                                                                                                                                                                             |  |  |

Table 3-6 contains two mechanisms to enqueue requests by the A.G.P. compliant master. The master chooses one mechanism at design time or during the initialization process and is not allowed to change during runtime. When **PIPE#** is used to enqueue addresses the master is not allowed to enqueue addresses using the **SBA** port. When the **SBA** port is used **PIPE#** can not be used.

| Table 3-7 A.G.P. Flow Control |      |                                                                                                                                                                                                                                                |  |
|-------------------------------|------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--|
| Name                          | Туре | Description                                                                                                                                                                                                                                    |  |
| RBF#                          | in   | <i>Read Buffer Full</i> indicates if the master is ready to accept previously requested low priority read data or not. When <b>RBF#</b> is asserted the arbiter is not allowed to initiate the return of low priority read data to the master. |  |

Table 3-7 contains the new Flow Control beyond normal PCI flow control already required. If the master is always ready to accept return data, the A.G.P. compliant master is not required to implement this signal and the corresponding pin on the target is tied (internally pulled up) in the deasserted state.

|          |      | Table 3-8         A.G.P.         Status         Signals                                                                                                                                                                                                                                                              |  |
|----------|------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--|
| Name     | Туре | Description                                                                                                                                                                                                                                                                                                          |  |
| ST[2::0] | out  | <i>Status</i> bus provides information from the arbiter to a Master on what it may do. <b>ST[2::0]</b> only have meaning to the master when its <b>GNT#</b> is asserted. When <b>GNT#</b> is deasserted these signals have no meaning and must be ignored.                                                           |  |
|          |      | 000 Indicates that previously requested low priority read or flush data is being returned to the master.                                                                                                                                                                                                             |  |
|          |      | 001 Indicates that previously requested high priority read data is being returned to the master.                                                                                                                                                                                                                     |  |
|          |      | 010 Indicates that the master is to provide low priority write data for a previous enqueued write command.                                                                                                                                                                                                           |  |
|          |      | 011 Indicates that the master is to provide high priority write data for a previous enqueued write command.                                                                                                                                                                                                          |  |
|          |      | 100 Reserved (Arbiter must not issue. May be defined in the future by Intel.)                                                                                                                                                                                                                                        |  |
|          |      | 101 Reserved (Arbiter must not issue. May be defined in the future by Intel.)                                                                                                                                                                                                                                        |  |
|          |      | 110 Reserved (Arbiter must not issue. May be defined in the future by Intel.)                                                                                                                                                                                                                                        |  |
|          |      | <ul> <li>111 Indicates that the master has been given permission to start a bus transaction. The master may enqueue A.G.P. requests by asserting <b>PIPE#</b> or start a PCI transaction by asserting <b>FRAME#</b>. <b>ST[2::0]</b> are always an output from the Core logic and an input to the master.</li> </ul> |  |

Table 3-8 describes the status signals, their meaning and indicate how the AD bus will be used for subsequent transactions. The AD bus can be used to enqueue new requests, return previously requested read data, or request the master to provide previously enqueued write data. The ST[2::0] are qualified by the assertion of GNT#.

| Table 3-9 A.G.P. Clock list |                   |                                                                                                                                                         |  |
|-----------------------------|-------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------|--|
| Name                        | Туре              | Description                                                                                                                                             |  |
| AD_STB0                     | s/t/s<br>(in/out) | <b>AD</b> <i>Bus Strobe 0</i> provides timing for 2x data transfer mode on the <b>AD[15::00]</b> . The agent that is providing data drives this signal. |  |
| AD_STB1                     | s/t/s<br>(in/out) | <b>AD</b> <i>Bus Strobe 1</i> provides timing for 2x data transfer mode on the <b>AD[31::16]</b> . The agent that is providing data drives this signal. |  |
| SB_STB                      | s/t/s<br>(in)     | <i>SideBand Strobe</i> provides timing for <b>SBA[7::0]</b> and is always driven by the A.G.P. compliant master (when supported).                       |  |
| CLK                         | t/s<br>(in)       | <i>Clock</i> provides timing for A.G.P. and PCI control signals.                                                                                        |  |

## 3.4 A.G.P. Semantics of PCI signals

PCI signals, for the most part, are redefined when used in A.G.P. transactions. Some signals have slightly different semantics. **FRAME#**, **IDSEL**, **STOP#**, and **DEVSEL#** are not used by the A.G.P. protocol. The exact role of all PCI signals during A.G.P. transactions are described in **Table 3-10**.

July 31, 1996

|            | Table 3-10 PCI signals in relation to A.G.P.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
|------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| FRAME#     | Not used. <b>FRAME#</b> remains deasserted by its own pull up resistor.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
| IRDY#      | New meaning. <b>IRDY#</b> indicates the A.G.P. compliant master is ready to provide <i>all</i> write data for the current transaction. Once <b>IRDY#</b> is asserted for a write operation, the master is not allowed to insert waitstates. The assertion of <b>IRDY#</b> for reads, indicates that the master is ready to transfer a subsequent block of read data. The master is <i>never</i> allowed to insert a wait state during the initial block of a read transaction. However, it may insert waitstates after each block transfers. ( <i>There is no</i> <b>FRAME# IRDY#</b> <i>relationship for A.G.P. transactions.</i> ) |
| TRDY#      | New meaning. <b>TRDY#</b> indicates the A.G.P. compliant target is ready to provide read data for the entire transaction (when transaction can complete within four clocks)a block) or is ready to transfer a (initial or subsequent) block of data, when the transfer requires more than four clocks to complete. The target is allowed to insert wait states after each block transfers on both read and write transactions.                                                                                                                                                                                                       |
| STOP#      | Not used by A.G.P                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
| DEVSEL#    | Not used by A.G.P                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
| IDSEL      | Not used by A.G.P                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
| PERR#      | Not used by A.G.P. (Optional for PCI operation per exceptions granted by PCI 2.1 specification.)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
| SERR#      | Same as PCI. (May be used by an A.G.P. compliant master to report a catastrophic error when the core logic supports a <b>SERR#</b> <sup>13</sup> pin for the A.G.P. port.)                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| REQ#       | Same as PCI. (Used to request access to the bus to initiate a PCI or an A.G.P. request.)                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
| GNT#       | Same meaning as PCI but additional information is provided on <b>ST[2::0]</b> . The additional information indicates that the master is the recipient of previously requested read data (high or low priority), it is to provide write data (high or low priority), for a previously enqueued write command or has been given permission to start a bus transaction (A.G.P. or PCI).                                                                                                                                                                                                                                                 |
| RST#       | Same as PCI.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| AD[31::00] | Same as PCI.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |

<sup>&</sup>lt;sup>13</sup> An A.G.P. **SERR#** signal cannot be tied to the PCI **SERR#** signal because of different clocking domains. The assertion of **SERR#** must meet setup and hold times to **CLK**.

| C/BE[3::0]#  | Slightly different meaning. Provides command information (different commands than PCI) by the master when requests are being enqueued using <b>PIPE#</b> . Provides valid byte information during A.G.P. write transactions and is driven by the master. The target drives to "0000" during the return of A.G.P. read data and is ignored by the A.G.P. compliant master. |
|--------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| PAR          | Not used by A.G.P                                                                                                                                                                                                                                                                                                                                                         |
| LOCK#        | Is not supported on the A.G.P. interface for either A.G.P. or PCI transactions.                                                                                                                                                                                                                                                                                           |
| INTA#, INTB# | Interrupt request signals <sup>14</sup> are the same as PCI and follow the same usage.<br>(Must be level sensitive and shareable.) <b>INTA#</b> for a single function device,<br><b>INTB#</b> for a two function device. <b>INTC#</b> and <b>INTD#</b> are not supported on<br>the A.G.P. connector.                                                                      |

## 3.5 Bus Transactions

### 3.5.1 Address Transactions

As described earlier in this A.G.P. interface specification there are two ways to enqueue requests, the **AD** bus or the **SBA** port. If the master chooses the **SBA** port it is not allowed to asserted **PIPE#** for any transactions. If the master uses **PIPE#** to enqueue requests it is not allowed to use the **SBA** port. The following diagrams will first illustrate the enqueuing of addresses on the **AD** bus and then on the **SBA** port.

## 3.5.1.1 AD Bus

The master requests the permission from the core logic to use the **AD** bus to initiate an A.G.P. request or a PCI transaction by asserting **REQ#**. The arbiter grants permission by asserting **GNT#** with **ST[2::0]** equal to "111" hereafter referred to as *START*. When the master receives START it is required to start the bus operation within two clocks of when the bus becomes available. For example, when the bus is in an idle condition when START is received, the master is required to initiate the bus transaction on the next clock and the one following. When a transaction is currently active on the bus becomes available. For example, when the master is required to start the bus operation within 2 clocks of when the bus becomes available. For example, when the current transaction is an A.G.P. read (from target to master), a turn around cycle is required between the last data phase of the read and the start of the request or assertion of **PIPE#** or **FRAME#**. In this example, the master is required to start the request either two or three clocks from the completion of the last data. The next clock after is not allowed since a turn around cycle is required when ownership of the **AD** bus changes. Once this has occurred the master is required to start the transaction, the turn around cycle is not required and therefore the master must initiate the transaction the clock following the completion of the last data phase or the clock after. Each of these relationships is described in section 3.6.

<sup>&</sup>lt;sup>14</sup> These can be tied to the PCI **INTx#** signals since these are o/d signals and are level sensitive.



Figure 3-7 Single Address - No Delay by Master

Figure 3-7 illustrates a single address being enqueued by the master. Sometime before clock 1, the master asserted **REQ#** to gain permission to use the **AD** bus. The arbiter grants permission by indicating START on clock 2. A PCI only master is not required to start the transaction within 2 clocks. It is a violation of the A.G.P. protocol if an A.G.P. compliant master delays starting a request (assertion of **PIPE#** or **FRAME#**) by more than 2 clocks. A new request (address, command and length) are enqueued on each clock in which **PIPE#** is asserted. The address of the request to be enqueued is presented on **AD[31::03]**, the length on **AD[2::0]** and the command on **C/BE[3::0]#**. In this figure, only a single address is enqueued since **PIPE#** is asserted for only a single clock. The master indicates that the current address is the last it intends to enqueue when **PIPE#** is asserted and **REQ#** is deasserted which occurs in the figure on clock 3. Once the arbiter detects the assertion of **PIPE#** or **FRAME#** (which occurs on clock 3), it deasserts **GNT#** (on clock 4).

A.G.P. Interface Specification



Figure 3-8 Multiple Addresses Enqueued, Maximum Delay by Master

Figure 3-8 illustrates a master that enqueues 5 requests, where the first request is delayed the maximum allowed by the A.G.P. interface specification, which is two clocks from the indication of START. In this case, START is indicated on clock 2, but the master does not assert **PIPE#** until clock 4. Figure 3-7 illustrates the earliest the master could start its request and Figure 3-8 illustrates the latest in which the master is allowed to start the request when the bus is idle. Note that **REQ#** remains asserted on clock 7 to indicate that the current request is not the last one. When **REQ#** is deasserted on clock 8 with **PIPE#** still asserted this indicates that the current address is the last one to be enqueued during this transaction. **PIPE#** must be deasserted on the next clock when **REQ#** is sampled deasserted. If the master desired to enqueue more requests during this bus operation, it would simply continue asserting **PIPE#** until all of its requests are enqueued or until the master has filled all the available request slots provided by the core logic.

The master is not allowed to insert any waitstates while enqueuing requests and the target has no mechanism to stop an address from being enqueued. Once **PIPE#** is asserted, every rising edge of **CLK**, enqueues a new request. The target has no mechanism to stop or delay the master from enqueuing more requests once the arbiter has indicated START. The clock following the last address, the master is required to tri-state the **AD** and **C/BE#** buses, unless the master is to provide write data (as indicated by **GNT#** and **ST[2::0]**) associated with a previously enqueued write request.

#### 3.5.1.2 SBA Port

An A.G.P. compliant master may choose to use **SBA** port to enqueue requests instead of using **PIPE#** and the **AD** bus. The **SBA** port is always driven by the master and if not enqueuing new requests, the master must drive the NOP command on the port which is signaled by driving **SBA**[7::0] to "1111 1111" or FFh for 1x data transfers and "1111 1111 1111 1111" or FFFh for 2x data transfers. The target ignores the **SBA** port until enabled to decode it. All commands on the **SBA** port always come in pairs except for the NOP case when 1x data transfer is done. In this case, a NOP is a single clock in duration. Unlike the enqueuing requests on the **AD** bus, the master does not request use of the port, but simply sends the request at anytime(when a request slot is available). If a subsequent command is near a previous command then only the lower address bits and length need to be transferred. The target will use previous upper address bits and command to initiate a new memory access. With this abbreviated addressing, the **AD** bus can be completely utilized transferring small pieces of data that are in close to each other. In the diagrams, the notion of "R1H

Petitioners HTC & LG - Exhibit 1024, p. 45

and R1L" indicate that this is request 1 high and request 1 low. High refers to the upper 8 bits (where the OP code resides) and Low refers to the lower 8 bits. A request can be a Type 1, Type 2 or a type 3 command as described in section 3.1.3.



Figure 3-9 1x SideBand Addressing

In Figure 3-9 the master sends the NOP encoding on clock 1 and sends the high bits of the of a Type x (1, 2 or 3) on clocks 2, 4 and 9 and the low order bits on clocks 3 5, and 10. The master send NOPs on clocks 6, 7 and 8 to indicated that the **SBA** port does not contain a new request. There is no specific sequence in which Type 1, 2 or 3 encodings are required to transfer the **SBA** port. In this figure, every non-NOP time could be a Type 1 or only Type 3 commands. Recall that memory accesses are only initiated when a Type 1 encoding is decoded by the target. A Type 2 simply stores updated middle addresses and command in the Type 2 register of the target. A Type 3 encoding updates the upper address bit in the Type 3 register. Only when a Type 1 command is received does the target reconstruct an address by using the Type 3 and Type 2 registers with the Type 1 value and enqueues it to the memory controller.



Figure 3-10 2x Side Band Addressing

Figure 3-10 illustrates the **SBA** port operating in 2x mode. In this mode, a new address is transferred across the **SBA** port each **CLK**. This figure is the same as the previous one except that both pieces of the encoding, the high and low portions, transfer across the port during a single **CLK** period.

#### 3.5.2 Basic Data Transactions

As described earlier, data transfers across the port as independent transactions from the request that initiated the data movement. The following sections will discuss data movement of 1x and 2x transfer modes, with each discussing basic read and write data transfers.

#### 3.5.2.1 1x Data Transfers





Figure 3-11 Minimum Delay by Target of Read Transaction

Figure 3-11 illustrates the returning of read data that was previously requested by the master. The bus is in an idle condition and the arbiter indicates to the master that the next transaction to appear on the **AD** bus is read data for the master. This is indicated by the assertion of **GNT#** with the **ST[2::0]** being 00x. To signal Low Priority Read data returning, the **ST** encoding would be "000", and High Priority Read data being indicated by "001". In the diagrams where the **ST** encoding is 00x, the data being moved could be low or high priority data. In those cases, that it makes a difference to the type of read data being returned the **ST** encodings will be either "000" or "001".

The master is informed that the read data is coming, when **GNT#** is asserted and **ST[2::1]** equals "00" which occurs on clock 2. The master knows that the next time **TRDY#** is asserted, that the **AD** bus contains valid data. Once **GNT#** has been asserted for read data, the master starts latching the **AD** bus on each rising clock and qualifies the data with **TRDY#**. When **TRDY#** is deasserted, the data is not valid. The entire transaction will complete without waitstates, once **TRDY#** is sampled asserted and the transaction completes within 4 clock. In this figure, only 16 bytes of data are being transferred to the master. Notice that **TRDY#** is a single pulse and that there is no **IRDY#** handshake as is done on PCI. When the transfer size of the read data can complete within 4 clocks, neither the master nor target are allowed to do flow control (waitstates) on the transaction. The **C/BE#** bus does not contain valid byte enables since the smallest addressable size of memory is 8 bytes and all 8 bytes are always returned. The **C/BE#** bus is driven by the A.G.P. compliant target to "0000" and the byte enables are ignored by the master. Once **TRDY#** has been asserted, it must be deasserted by the following clock (unless it will be asserted again) and tri-stated. This is shown in this figure by a solid line being driven high, then on the next clock the signal is tri-stated. The signal is held in this state by a pull-up. This is referred to as a sustained tri-state signal and is the same as **TRDY#** as defined by PCI.

July 31, 1996

Figure 3-11 illustrates the earliest the target can return data to the master once **GNT#** has been asserted indicating a read data transfer. Notice that there is no **PIPE#** or **SBA** port in the figure, the transaction in which data is returned to the master is the same no matter how the request was transferred to the target.



Figure 3-12 Minimum Delay on Back to Back Read Data

Figure 3-12 illustrates a stream of 8 byte read operations being returned to the master. This figure shows that the arbiter is indicating to the master that read data is being returned on every clock. Remember that the minimum transfer size is 8 bytes and in 1x transfer mode, it requires two clocks to return the data. Therefore enqueuing **GNT#**'s earlier accomplishes nothing. The arbiter will not assert **GNT#** for a new transaction until the last clock of the current read transaction.



Figure 3-13 Master does not Delay Providing Write Data

July 31, 1996

Figure 3-13 shows a basic write data transfer. The arbiter indicates to the master that write data should be provide to the core logic which is indicated by the assertion of **GNT#** and **ST[2::0]** being "010 or 011". The first being a low priority write data and the second indicating that it is a high priority write data. In this example the signaling is the same and therefore the "01x" value is used. The master is required to provide the write data within 2 clocks of the indication from the arbiter. In this example, the master provides the data immediately because the bus was idle. The assertion of **IRDY#** is for a single pulse and goes with the first piece of data to indicate to the target that data is valid. Once **IRDY#** has been asserted, data transfers 4 bytes per CLK until the transaction has completed (for transactions that complete within four clocks). In this example the transaction is 16 bytes and completes in 4 clocks. The master is required to deassert and then tri-state **IRDY#** after it was asserted. The data is transferred on the **AD** bus while the **C/BE[3::0]#** provide the byte enables. The byte enables indicate which byte lanes carry meaningful data. The target is not allowed to delay the movement of write data (initial data block) after **GNT#** and **ST** bus indicate a write data transfer.



Figure 3-14 Back to Back Write Data Transfers - no Delays

Figure 3-14 is an example of back to back write data transfers. Each of these transactions are 8 bytes and could be either high priority or low priority write data transfers. On clock 2, the arbiter indicates to the master to provide previously requested write data to the core logic. Since these are small transfers the arbiter provides a **GNT#** on every other clock. Since a new transaction begins on clock 3, 5, 7 and 9, the master asserts **IRDY#** on these clocks to indicate that the first piece of data of each transaction is valid on the **AD** bus.

#### 3.5.2.2 2x Data Transfers

This section discusses 2x data transfers. Basically this is the same as 1x clocking except an entire 8 bytes are transferred during a single **CLK** period. This requires that two 4 byte pieces of data are transferred across the **AD** bus per **CLK** period. First a read data transfers will be discussed and then a write transfer.

July 31, 1996



Figure 3-15 2x Read Data no Delay

Figure 3-14 is the same as Figure 3-11 except that 16 bytes are transferred in 4 clocks, while in this figure 32 bytes are transferred during the same 4 clocks. The control signals are identical. The AD\_STBx signal has been added when data is transferred at 8 bytes per CLK period. AD\_STBx represents AD\_STB0 and AD\_STB1 and are used by the 2x interface logic to know when valid data is present on the AD bus. The control logic (TRDY# in this case) indicates when data can be used by the internal consumer of the data.



Figure 3-16 2x Back to Back Read Data - No Delay

Figure 3-16 shows back to back 8 byte read transactions. The **ST[2::0]** toggle between "000" and "001" to illustrate that they are actually changing. However, they are not required to change between HP and low

priority in order to do back to back transactions. In this diagram, **TRDY#** must be asserted on each clock since a new transaction starts on each clock.



Figure 3-17 2x Basic Write No Delay

Figure 3-17 is a basic write transaction that transfers data at the 2x rate. This figure is the same as Figure 3-13 (1x basic write) and there is no difference in the control signals and only more data is moved. The normal control signals determine when data is valid or not. This diagram moves 32 bytes of data in the same time as 16 bytes are moved in Figure 3-13.



Figure 3-18 QuadWord Writes Back to Back - No Delays

Figure 3-18 instead of a single transfer with 32 bytes of data, these diagrams illustrates multiple 8 byte write operations. When the transactions are short, the arbiter is required to give grants on every clock or the **AD** bus will not be totally utilized. In this example a new write is started on each rising clock edge except clock

7, because the arbiter deasserted **GNT#** on clock 6. Since a new transaction is start on each **CLK**, the **IRDY#** signal is only deasserted on clock 7.

#### 3.5.3 Flow Control

#### 3.5.3.1 Initial Data Block



Figure 3-19 Master's Data Buffer is Full - No Delay on Read Data

Figure 3-19 is a diagram where the master indicates to the target that it can accept the current transaction but has no buffer available for any additional transactions at this time. This is indicated by the master by asserting **RBF#** on clock 3. In this figure, the master asserts **RBF#** the clock following the assertion of **GNT#** to prevent the arbiter from returning additional LP read transactions. As illustrated in the previous diagram, the arbiter only asserts **GNT#** on every other clock at 1x data rates because that is all that is required to keep the **AD** bus 100% busy. Enqueuing the **GNT#** earlier only causes the master to provide more buffering for the return of read data and does not improve performance or efficiency of the bus.



Figure 3-20 High Prority Read Data Returned While RBF# Asserted

Figure 3-20 is the same as Figure 3-19 except that the arbiter returns high priority read data when the master indicates that it is not capable of accepting read data by asserting **RBF#**. The target is allowed to return High Priority (HP) data at any time, including when the master is not able to accept LP read data. **RBF#** only applies to "LP read data and not to HP read data. The master indicates that it is ready to continue

accepting LP read data on clock 5 by deasserting **RBF#**. The arbiter indicates that HP read data is being returned on clock 4 by asserting **GNT#** and **ST[2::0]** are "001". The data transfer for HP read is the same as a read data, with **TRDY#** being asserted on the initial data phase.



Figure 3-21 2x Read Data with RBF# Asserted

Figure 3-21 shows the master asserting **RBF#** indicating to the target that the masters read data buffer is full and it is not capable of accepting any new data. In order to ensure that 8 byte read operations can occur without delays on the **AD** bus, the arbiter must enqueue a new grant on each clock. In this case, the master must be able to accept 2 transactions worth of data because the arbiter is driving the second grant on the same clock in which the master is driving **RBF#**. Therefore, the master must provide a minimum of buffer for two transactions, as in this case 8 + 32 bytes of buffering. In general the master for read transactions (assuming the master is capable of asserted **RBF#** the clock following the assertion for **GNT#** indicating that read data is being returned. If the master delays the assertion of **RBF#** or desires to minimize the frequency in which it stalls the return of read data, more buffering should be provided. This figure, interleaves HP read data when the master indicates that it can not accept any more LP read data. If HP data was not pending, the **AD** bus would have been dead on clocks 5 and 6.

July 31, 1996



#### 3.5.3.2 Subsequent Data Block

Figure 3-22 Throttle Point for Subsequent Data Block - No Delay

For a transaction that requires more than four clocks to complete, both the master and target are allowed to insert waitstates. This point is referred to as a throttle point (TP in the diagrams). After every four clocks that data transfers, both the master and target are allowed to delay the next block of data from being transferred. Neither agent can cause the subsequent block from transferring, but can only delay it. In Figure 3-22 the first TP point occurs on clock 3, while the second occurs on clock 7. The TP point always occurs two clocks before the subsequent block of data would start to transfer. Another way to describe it for the 1x data transfers, the TP occurs on the clock in which the  $9^{\text{th}}-12^{\text{th}}$  bytes transfer for the first TP and  $25^{\text{th}} - 28^{\text{th}}$  bytes for the second.



Figure 3-23 Master Delays Subsequent Data Block

July 31, 1996

Figure 3-23 illustrate the TP where the target indicated that it is ready to transfer data while the master delays the transfer by two clocks. When both **IRDY#** and **TRDY#** are asserted which occurs on clock 7, the subsequent block of data begins to transfer two clocks later. Note that once **TRDY#** is asserted it must remain asserted until **IRDY#** is asserted, at which point both must be deasserted. A wait state is inserted on the bus on clock 7 and 8 because **IRDY#** was not asserted on clocks 5 and 6. The TP starts on clock 5 and ends on clock 7. **xRDY#** must be actively driven during the entire TP.



Figure 3-24 Write with Subsequent Block - No Delay

Figure 3-24 is a write transaction that requires more than four clocks to complete and therefore has a subsequent block on data. This figure shows the transaction completing without waitstates because the target was ready for the subsequent block of data by asserting **TRDY#** for clock 4. Since the transaction does not cross into a second subsequent block, the throttle point which would occur on clock 8 is meaningless. Only the target is allowed to flow control a write once it has started to transfer data.

July 31, 1996



Figure 3-25 Target Delays Subsequent Write Data Block

Figure 3-25 is the same diagram as Figure 3-24 Write with Subsequent Block - No Delayexcept that the target is not ready in the case. Because **TRDY#** was deasserted on clocks 4 and 5, waitstates are inserted on clocks 6 and 7. Because **TRDY#** was asserted on clock 6, the first data phase of the second block is transferred on clock 8. The "w" on clocks 6 and 7 indicate that data could be driven or not driven but must be valid on clock 8. The master is not required to drive meaningful data for clocks 6 and 7 since **TRDY#** indicates that the target will not accept the data until clock 8.



Figure 3-26 Earliest Read Data Returned after a Request

July 31, 1996

Figure 3-26 illustrates the earliest that read data can be returned to a master following the enqueuing of an address by the master. A turn around cycle is required when ownership of the **AD** bus occurs. The data being returned on clock 6 was requested sometime ago and is not related to the address being enqueued on clock 4. Clock 5 is the earliest the master can be informed of the return of read data. Figure 8-44 illustrates when more requests are enqueued but the **GNT#** for the data transfer remains the same. Figure 8-45 illustrates the target inserting a wait state before providing the data. Again notice that **TRDY#** is only asserted for a single clock. Once **GNT#** has been asserted indicating the return of read data, the master watches for the assertion of **TRDY#** to indicate that the data for the first data phase is valid. Subsequent data phases complete one per clock for the entire transaction when the transaction completes within four clocks. For transactions that require more than four clocks to complete, a subsequent block is required and a TP is valid which occurs on clock 8.



Figure 3-27 Request followed by Write and then a Read

Figure 3-27 shows the sequence of enqueuing a single address, followed by write data being transferred, followed by read data being transferred. A turn around cycle is required on the bus each time ownership of the **AD** bus changes. The turn around on clock 3 occurs because the master did not know until clock 3 that it is to provide write data. If more requests had been enqueued then the turn around access could have been avoided. The master delayed the write data by one clock. Figure 8-51 and Figure 3-29 show multiple requests followed by write data where there is no turn around cycle.



Figure 3-28 Request followed by Read Data no Delay by Target.

Figure 3-28 shows the earliest that read data can be returned following a request being enqueued. One turn around clock is required since ownership of the **AD** bus occurs. Even without the turn around cycle the arbiter is not able to give a grant to start the read data transfer because, the **ST** bus must be held until the transaction starts, which occurs on clock 4. The arbiter then changes the encoding for clock 5.



Figure 3-29 Request followed by Write - no AD Turn around

Figure 3-29 illustrated how combining the process of enqueuing new requests and then moving write data eliminates a turn around cycle on the bus. Since **GNT#** is asserted on clock 3 to indicate that the master should provide write data immediately after the requests have been enqueued. The master does not tri-state the **AD** or **C/BE** buses, but starts driving write data. The assertion of **IRDY#** indicates that the write transaction has started and valid write data will appear on each clock. The target indicates that it is ready to accept the second block of data on clock 9 by asserting **TRDY#** on clock 7.

July 31, 1996



Figure 3-30 Basic PCI Transaction on A.G.P.

Figure 3-30 is a basic PCI transaction on the A.G.P. interface. If the PCI agent is a non A.G.P. compliant master, it ignores the **ST[2::0]** signals and everything appears to be a PCI bus. For those masters that are APG aware, then the **ST** bus indicates that permission to use the interface has been granted to initiate a request and not to move A.G.P. data.