`
HyperTransport™ Technology I/O Link

A High-Bandwidth I/O Architecture
`
`ADVANCED MICRO DEVICES, INC.
`One AMD Place
`
`Sunnyvale, CA 94088
`
`(#25012A)
`
`
`July 20, 2001
`
`
`
`
`WHITE PAPER
`
`Abstract
`
This white paper describes AMD’s HyperTransport™ technology, a new I/O architecture for personal computers, workstations, servers, high-performance networking and communications systems, and embedded applications. This scalable architecture can provide significantly increased bandwidth over existing bus architectures and can simplify in-the-box connectivity by replacing legacy buses and bridges. The programming model used in HyperTransport technology is compatible with existing models and requires little or no change to existing operating system and driver software.
`
`Contents
`
The I/O Bandwidth Problem
The HyperTransport™ Technology Solution
    Original Design Goals
    Flexible I/O Architecture
    Device Configurations
Technical Overview
    Physical Layer
        Minimal Pin Count
        Enhanced Low-Voltage Differential Signaling
        Greatly Increased Bandwidth
    Data Link Layer
        Initialization
    Protocol and Transaction Layers
        Commands
        Data Packets
        Address Mapping
        I/O Stream Identification
        Ordering Rules
    Session Layer
        Standard Plug ‘n Play Conventions
        Minimal Device Driver Porting
        Link Width Optimization
        Link Frequency Initialization
Implementation Examples
    Daisy Chain
    Switched Environment
    Multiprocessor System
HyperTransport™ Technology Specifications
HyperTransport™ Technology Consortium
Industry Support for HyperTransport™ Technology
Summary
Glossary
AMD Overview
Cautionary Statement
`
`
The I/O Bandwidth Problem
`
`While microprocessor performance continues to double every eighteen months, the
`
`performance of the I/O bus architecture has lagged, doubling in performance
`
`approximately every three years, as illustrated in Figure 1.
`
Figure 1. Trends in I/O Bus Performance (performance plotted against year, 1980 to 2000; labeled buses include 8-bit ISA, 16-bit ISA, VL-Bus, PCI 32/33, PCI-64/66, and 4X AGP)
`
This I/O bottleneck constrains system performance, resulting in diminished actual performance gains as the processor and memory subsystems evolve. Over the past 20 years, a number of legacy buses, such as ISA, VL-Bus, AGP, LPC, PCI-32/33, and PCI-X, have emerged that must be bridged together to support a varying array of devices. Servers and workstations require multiple high-speed buses, including PCI-64/66, AGP Pro, and SNA buses like InfiniBand. This hodge-podge of buses increases system complexity and adds many transistors devoted to bus arbitration and bridge logic, while delivering less than optimal performance.
`
`A number of new technologies are responsible for the increasing demand for
`
`additional bandwidth.
`
• High-resolution, texture-mapped 3D graphics and high-definition streaming video are escalating bandwidth needs between CPUs and graphics processors.

• Technologies like high-speed networking (Gigabit Ethernet, InfiniBand, etc.) and wireless communications (Bluetooth) are allowing more devices to exchange growing amounts of data at rapidly increasing speeds.

• Software technologies are evolving, resulting in breakthrough methods of utilizing multiple system processors. As processor speeds rise, so will the need for very fast, high-volume inter-processor data traffic.
`
`While these new technologies quickly exceed the capabilities of today’s PCI bus,
`
`existing interface functions like MP3 audio, v.90 modems, USB, 1394, and 10/100
`
`Ethernet are left to compete for the remaining bandwidth. These functions are now
`
`commonly integrated into core logic products.
`
`Higher integration is increasing the number of pins needed to bring these multiple
`
`buses into and out of the chip packages. Nearly all of these existing buses are single-
`
`ended, requiring additional power and ground pins to provide sufficient current return
`
`paths. High pin counts increase RF radiation, which makes it difficult for system
`
`designers to meet FCC and VDE requirements. Reducing pin count helps system
`
`designers to reduce power consumption and meet thermal requirements.
`
In response to these problems, AMD began developing the HyperTransport™ I/O link architecture in 1997. HyperTransport technology has been designed to provide system architects with significantly more bandwidth, low-latency responses, lower pin counts, compatibility with legacy PC buses, extensibility to new SNA buses, and transparency to operating system software, with little impact on peripheral drivers.
`
The HyperTransport™ Technology Solution
`
`HyperTransport technology, formerly codenamed Lightning Data Transfer (LDT),
`
`was developed at AMD with the help of industry partners to provide a high-speed, high-
`
`performance, point-to-point link for interconnecting integrated circuits on a board. With a
`
`top signaling rate of 1.6 GHz on each wire pair, a HyperTransport technology link can
`
`support a peak aggregate bandwidth of 12.8 Gbytes/s.
`
The HyperTransport I/O link is a complementary technology for InfiniBand and 1Gb/10Gb Ethernet solutions. Both InfiniBand and high-speed Ethernet interfaces are high-performance networking protocols and box-to-box solutions, while HyperTransport is intended to support “in-the-box” connectivity.
`
The HyperTransport specification provides both link-level and system-level power management capabilities optimized for processors and other system devices. The ACPI-compliant power management scheme is primarily message-based, reducing pin-count requirements.
`
HyperTransport technology is targeted at networking, telecommunications, computer, and high-performance embedded applications, and at any other application in which high speed, low latency, and scalability are necessary.
`
`
`The general features and functions of HyperTransport technology are summarized in
`
`Table 1.
`
`
Table 1. Feature and Function Summary

Feature/Function      HyperTransport™ Technology
Bus Type              Dual, unidirectional, point-to-point links
Link Width            2, 4, 8, 16, or 32 bits
Protocol              Packet-based, with all packets multiples of four bytes
                      (32 bits). Packet types include Request, Response, and
                      Broadcast, any of which can include commands,
                      addresses, or data.
Signaling             1.2-V Low-Voltage Differential Signaling (LVDS) with a
                      100-ohm differential impedance
Memory Model          Coherent and noncoherent
`
`Original Design Goals
`
In developing HyperTransport technology, the architects of the technology considered the design goals presented in this section. They wanted to develop a new I/O protocol for “in-the-box” I/O connectivity that would:

• Improve system performance
  - Provide increased I/O bandwidth
  - Reduce data bottlenecks by moving slower devices out of critical information paths
  - Reduce the number of buses within the system
  - Ensure low-latency responses
  - Reduce power consumption

• Simplify system design
  - Use a common protocol for “in-chassis” connections to I/O and processors
  - Use as few pins as possible to allow smaller packages and to reduce cost

• Increase I/O flexibility
  - Provide a modular bridge architecture
  - Allow for differing upstream and downstream bandwidth requirements

• Maintain compatibility with legacy systems
  - Complement standard external buses
  - Have little or no impact on existing operating systems and drivers

• Ensure extensibility to new system network architecture (SNA) buses

• Provide highly scalable multiprocessing systems
`
Flexible I/O Architecture
`
The resulting protocol defines a high-performance and scalable interconnect between CPU, memory, and I/O devices. Conceptually, the architecture of the HyperTransport I/O link can be mapped into five different layers, a structure similar to the Open System Interconnection (OSI) reference model.
`
`In HyperTransport technology:
`
• The physical layer defines the physical and electrical characteristics of the protocol. This layer interfaces to the physical world and includes data, control, and clock lines.

• The data link layer includes the initialization and configuration sequence, periodic cyclic redundancy check (CRC), disconnect/reconnect sequence, information packets for flow control and error management, and doubleword framing for other packets.

• The protocol layer includes the commands, the virtual channels in which they run, and the ordering rules that govern their flow.

• The transaction layer uses the elements provided by the protocol layer to perform actions, such as reads and writes.

• The session layer includes rules for negotiating power management state changes, as well as interrupt and system management activities.

These functions are completely described in the HyperTransport technology specifications, and several are discussed briefly in this white paper.
`
`
`Device Configurations
`
`HyperTransport technology creates a packet-based link implemented on two
`
`independent, unidirectional sets of signals. It provides a broad range of system topologies
`
`built with three generic device types:
`
• Cave: A single-link device at the end of the chain.

• Tunnel: A dual-link device that is not a bridge.

• Bridge: A device with a primary link upstream, in the direction of the host, and one or more secondary links.

Example configurations include:

• A cave device connected directly to a host bridge.

• A chain of tunnel devices connected to a host bridge.

• Multiple chains of tunnel devices connected to a bridge, which is then connected to a host bridge.

• Multiple chains of tunnel devices connected to a switch, which is then connected to a host bridge.

• Any combination of the above.
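
The three device types and the way they combine into chains can be sketched as a small data model. This is an illustrative sketch only: the class names, the `secondary_chains` field, and the validation rules are assumptions derived from the definitions above, not structures defined by the HyperTransport specification. In particular, the sketch allows a bridge at any position in a chain, which is a simplification.

```python
from dataclasses import dataclass, field
from enum import Enum


class DeviceType(Enum):
    CAVE = "cave"      # single-link device at the end of a chain
    TUNNEL = "tunnel"  # dual-link device that is not a bridge
    BRIDGE = "bridge"  # one primary (upstream) link, one or more secondary links


@dataclass
class Device:
    name: str
    kind: DeviceType
    # Each entry is one secondary chain, listed nearest device first.
    # Only bridges carry secondary chains in this sketch.
    secondary_chains: list = field(default_factory=list)


def validate_chain(chain):
    """Return True if a chain (nearest device first) is well formed.

    Rules assumed here:
    - a cave has a single link, so it can only be the last device;
    - only a bridge may own secondary chains;
    - every secondary chain must itself be a valid chain.
    """
    for position, device in enumerate(chain):
        is_last = position == len(chain) - 1
        if device.kind is DeviceType.CAVE and not is_last:
            return False
        if device.kind is not DeviceType.BRIDGE and device.secondary_chains:
            return False
        if not all(validate_chain(sub) for sub in device.secondary_chains):
            return False
    return True
```

For example, a chain of two tunnels ending in a cave validates, while a chain that places a cave ahead of a tunnel does not.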
`
`
Figure 2 shows several examples of the different device configurations possible in HyperTransport technology. In these figures, “P” indicates a primary interface and “S” indicates a secondary interface. (A brief glossary of some terminology used to describe the HyperTransport protocol can be found in the Glossary section.) More detailed block diagrams for some example implementations are shown in the Implementation Examples section.
`
`
`
Figure 2. Example HyperTransport™ Technology Device Configurations (panels: Single-Link, Tunnel, Bridge with Tunnel, Bridge without Tunnel, Chain, Tree)
`
`
`Technical Overview
`
`Physical Layer
`
`Each HyperTransport link consists of two point-to-point unidirectional data paths, as
`
`illustrated in Figure 3.
`
• Data path widths of 2, 4, 8, 16, and 32 bits can be implemented either upstream or downstream, depending on the device-specific bandwidth requirements.

• Commands, addresses, and data (CAD) all use the same set of wires for signaling, dramatically reducing pin requirements.

All HyperTransport technology commands, addresses, and data travel in packets. All packets are multiples of four bytes (32 bits) in length. If the link uses data paths narrower than 32 bits, successive bit-times are used to complete the packet transfers.
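
The relationship between packet length, link width, and bit-times can be illustrated with a short calculation. This is a minimal sketch: the function name and the set of accepted widths (taken from Table 1) are illustrative choices, not part of any HyperTransport API.

```python
VALID_WIDTHS = (2, 4, 8, 16, 32)  # link widths in bits, as listed in Table 1


def bit_times_for_packet(packet_bytes, link_width_bits):
    """Number of successive bit-times needed to move one packet.

    Each bit-time transfers `link_width_bits` bits, and every packet is a
    multiple of four bytes, so the division is always exact.
    """
    if link_width_bits not in VALID_WIDTHS:
        raise ValueError("unsupported link width")
    if packet_bytes <= 0 or packet_bytes % 4 != 0:
        raise ValueError("packets must be a positive multiple of four bytes")
    return (packet_bytes * 8) // link_width_bits
```

Under these assumptions, a four-byte packet takes one bit-time on a 32-bit link, four bit-times on an 8-bit link, and sixteen bit-times on a 2-bit link.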
`
`
Figure 3. HyperTransport™ Technology Data Paths (two unidirectional paths, each with a CTL signal and a data path of 2, 4, 8, 16, or 32 bits)
`
`The HyperTransport link was specifically designed to deliver a high-performance and
`
`scalable interconnect between CPU, memory, and I/O devices, while using as few pins as
`
`possible.
`
• To achieve very high data rates, the HyperTransport link uses low-swing differential signaling with on-die differential termination.

• To achieve scalable bandwidth, the HyperTransport link permits seamless scalability of both frequency and data width.
`
`Minimal Pin Count
`
`The designers of HyperTransport technology wanted to use as few pins as possible to
`
`enable smaller packages, reduced power consumption, and better thermal characteristics,
`
`while reducing total system cost. This goal is accomplished by using separate
`
`unidirectional data paths and very low-voltage differential signaling.
`
`
`The signals used in HyperTransport technology are summarized in Table 2.
`
• Commands, addresses, and data (CAD) all share the same bits.

• Each data path includes a Control (CTL) signal and one or more Clock (CLK) signals.
  - The CTL signal differentiates commands and addresses from data packets.
  - For every grouping of eight bits or less within the data path, there is a forwarded CLK signal. Clock forwarding reduces clock skew between the reference clock signal and the signals traveling on the link. Multiple forwarded clocks limit the number of signals that must be routed closely in wider HyperTransport links.

• For most signals, there are two pins per bit.

• In addition to CAD, Clock, Control, VLDT power, and ground pins, each HyperTransport device has Power OK (PWROK) and Reset (RESET#) pins. These pins are single-ended because of their low-frequency use.

• Devices that implement HyperTransport technology for use in lower power applications such as notebook computers should also implement Stop (LDTSTOP#) and Request (LDTREQ#). These power management signals are used to enter and exit low-power states.
`
Table 2. Signals Used in HyperTransport™ Technology

Signal     Description                                  Comment
CAD        Commands, Addresses, and Data: carries       CAD width can be different in each
           command, address, or data information.       direction.
CTL        Control: used to distinguish control
           packets from data packets.
CLK        Clock: forwarded clock signal.               Each byte of CAD has a separate clock
                                                        signal. Data is transferred on each
                                                        clock edge.
PWROK      Power OK: power and clocks are stable.       Single-ended.
RESET#     HyperTransport Technology Reset: resets      Single-ended.
           the chain.
LDTSTOP#   HyperTransport Technology Stop: enables      Used in systems requiring power
           and disables links during system state       management. Single-ended.
           transitions.
LDTREQ#    HyperTransport Technology Request:           Used in systems requiring power
           requests re-enabling links for normal        management. Single-ended.
           operation.
`
`
`
`
`
`
`
`
Enhanced Low-Voltage Differential Signaling
`
The signaling technology used in HyperTransport technology is a type of low-voltage differential signaling (LVDS). However, it is not the conventional IEEE LVDS standard. It is an enhanced LVDS technique developed to evolve with the performance of future process technologies. This is designed to help ensure that the HyperTransport technology standard has a long lifespan. LVDS has been widely used in these types of applications because it requires fewer pins and wires. It is also designed to reduce cost and power requirements because the transceivers are built into the controller chips.

HyperTransport technology uses low-voltage differential signaling with a differential impedance (ZOD) of 100 ohms for CAD, Clock, and Control signals, as illustrated in Figure 4. Characteristic line impedance is 60 ohms. The driver supply voltage is 1.2 volts, instead of the conventional 2.5 volts for standard LVDS. Differential signaling and the chosen impedance provide a robust signaling system for use on low-cost printed circuit boards. Common four-layer PCB materials with specified dielectric, trace, and space dimensions and tolerances, or controlled-impedance boards, are sufficient to implement a HyperTransport I/O link. The differential signaling permits trace lengths up to 24 inches for 800 Mbit/s operation.
`
`
`
`Figure 4. Enhanced Low-Voltage Differential Signaling (LVDS)
`
`
`At first glance, the signaling used to implement a HyperTransport I/O link would
`
`seem to increase pin counts because it requires two pins per bit and uses separate
`
`upstream and downstream data paths. However, the increase in signal pins is offset by
`
`two factors:
`
• By using separate data paths, HyperTransport I/O links are designed to operate at much higher frequencies than existing bus architectures. This means that buses delivering equivalent or better bandwidth can be implemented using fewer signals.

• Differential signaling provides a return current path for each signal, greatly reducing the number of power and ground pins required in each package.
`
`Greatly Increased Bandwidth
`
Commands, addresses, and data traveling on a HyperTransport link are double-pumped: transfers take place on both the rising and falling edges of the clock signal. For example, if the link clock is 800 MHz, the data rate is 1600 MHz.

• An implementation of HyperTransport links with 16 CAD bits in each direction with a 1.6-GHz data rate provides bandwidth of 3.2 Gbytes/s in each direction, for an aggregate peak bandwidth of 6.4 Gbytes/s, or 48 times the peak bandwidth of a 33-MHz PCI bus.

• A low-cost, low-power HyperTransport link using two CAD bits in each direction and clocked at 400 MHz provides 200 Mbytes/s of bandwidth in each direction, or nearly four times the peak bandwidth of PCI 32/33.
`
`Such a link can be implemented with just 24 pins, including power and ground pins,
`
`as shown in Table 3.
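
The bandwidth figures quoted in this section follow from the double-pumped clock and the link width. The sketch below reproduces that arithmetic; the function names are illustrative, and the 133 Mbytes/s figure used for PCI 32/33 is the commonly quoted peak value, assumed here rather than stated in this paper.

```python
PCI_32_33_PEAK_MBYTES = 133  # assumed commonly quoted peak for 32-bit, 33-MHz PCI


def data_rate_mtps(clock_mhz):
    """Double-pumped link: one transfer on each clock edge."""
    return 2 * clock_mhz


def bandwidth_mbytes_per_s(width_bits, clock_mhz):
    """Peak bandwidth in one direction, in Mbytes/s."""
    return data_rate_mtps(clock_mhz) * width_bits / 8


def aggregate_mbytes_per_s(width_bits, clock_mhz):
    """Peak bandwidth summed over both directions of a symmetric link."""
    return 2 * bandwidth_mbytes_per_s(width_bits, clock_mhz)
```

With an 800-MHz clock, a 16-bit link gives 3200 Mbytes/s in each direction and 6400 Mbytes/s aggregate, roughly 48 times the assumed PCI figure. A 32-bit link at the same clock gives the 12.8 Gbytes/s aggregate peak quoted earlier, and a 2-bit link clocked at 400 MHz gives 200 Mbytes/s in each direction.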
`
Table 3. Total Pins Used for Each Link Width

Link Width (bits, each direction)     2     4     8    16    32
Data Pins (total)                     8    16    32    64   128
Clock Pins (total)                    4     4     4     8    16
Control Pins (total)                  4     4     4     4     4
Subtotal (High Speed)                16    24    40    76   148
VLDT                                  2     2     3     6    10
GND                                   4     6    10    19    37
PWROK                                 1     1     1     1     1
RESET#                                1     1     1     1     1
Total Pins                           24    34    55   103   197
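
The high-speed pin counts in Table 3 can be reproduced from the signaling rules described earlier: two pins per bit, two directions, one forwarded clock per group of up to eight CAD bits, and one control signal per direction. The sketch below is illustrative only. The VLDT and GND counts are copied from the table, since the paper gives no formula for them, and the function names are hypothetical.

```python
import math

# (VLDT, GND) pin counts per link width, copied from Table 3.
POWER_PINS = {2: (2, 4), 4: (2, 6), 8: (3, 10), 16: (6, 19), 32: (10, 37)}

PINS_PER_SIGNAL = 2 * 2  # differential pair, times two directions


def high_speed_pins(width_bits):
    """Data, clock, and control pins for a symmetric link."""
    data = width_bits * PINS_PER_SIGNAL
    clock = math.ceil(width_bits / 8) * PINS_PER_SIGNAL  # one CLK per <= 8 bits
    control = 1 * PINS_PER_SIGNAL
    return data + clock + control


def total_pins(width_bits):
    """High-speed pins plus VLDT, GND, PWROK, and RESET#."""
    vldt, gnd = POWER_PINS[width_bits]
    return high_speed_pins(width_bits) + vldt + gnd + 2  # PWROK and RESET#
```

Evaluating `total_pins` for widths 2 through 32 yields 24, 34, 55, 103, and 197, matching the last row of the table.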
`
`
`Data Link Layer
`
`The data link layer includes the initialization and configuration sequence, periodic
`
`cyclic redundancy check (CRC), disconnect/reconnect sequence, information packets for
`
`flow control and error management, and doubleword framing for other packets. These
`
topics are discussed in detail in the HyperTransport™ Technology Input/Output Link
`
`Protocol Specification (#23888). This section of the white paper includes a brief
`
`discussion of the initialization process supported by HyperTransport technology.
`
`Initialization
`
`HyperTransport technology-enabled devices with transmitter and receiver links of
`
`equal width can be easily and directly connected. Devices with asymmetric data paths can
`
`also be linked together easily. Extra receiver pins are tied to logic 0, while extra
`
`transmitter pins are left open. During power-up, when RESET# is asserted and the
`
`Control signal is at logic 0, each device transmits a bit pattern indicating the width of its
`
`receiver. Logic within each device determines the maximum safe width for its
`
`transmitter. While this may be narrower than the optimal width, it provides reliable
`
`communications between devices until configuration software can optimize the link to
`
`the widest common width.
`
For applications that typically send the bulk of the data in one direction, component vendors can save costs by implementing a wide path for the majority of the traffic and a narrow path in the lesser-used direction. Devices are not required to implement equal-width upstream and downstream links.
`
`Protocol and Transaction Layers
`
The protocol layer includes the commands, the virtual channels in which they run, and the ordering rules that govern their flow. The transaction layer uses the elements provided by the protocol layer to perform actions, such as read requests and responses. These topics are discussed in more detail in the HyperTransport™ Technology Input/Output Link Protocol Specification (#23888).
`
`
`Commands
`
`All HyperTransport technology commands are either four or eight bytes long and
`
`begin with a 6-bit command type field. The most commonly used commands are Read
`
`Request, Read Response, and Write. The basic commands are summarized in Table 4,
`
listed by virtual channel. A virtual channel contains requests or responses with the same
`
`ordering priority.
`
`
Table 4. Basic HyperTransport™ Technology Commands

Virtual Channel   Command                    Comments
Posted            Posted Write               Followed by data packet(s).
                  Broadcast                  Issued by host bridge downstream to communicate
                                             information to all devices.
                  Fence                      All posted requests in a stream cannot pass it.
Non-Posted        Non-Posted Write
                  Read                       Designates whether response can pass posted
                                             requests or not.
                  Flush                      Forces all posted requests to complete.
                  Atomic Read-Modify-Write   Generated by I/O devices or bridges and directed
                                             to system memory controlled by the host.
Responses         Read Response              Response to read command; is followed by data
                                             packet(s).
                  Target Done                A transaction not requiring returned data has
                                             completed at its target.
`
`Figure 5 shows the command format.
`
Figure 5. Command Format (rows are bit times; columns are bits 7 through 0; read and write commands carry an address)
`
`When the command requires an address, the last byte of the command is concatenated
`
`with an additional four bytes to create a 40-bit address.
`
`
`Data Packets
`
`A Write command or a Read Response command is followed by data packets. Data
`
`packets are four to 64 bytes long in four-byte increments. Figure 6 shows a packet of
`
`eight bytes.
`
`
Figure 6. Eight-Byte Data Packet (rows are bit times; columns are bits 7 through 0)
`
`Transfers of less than four bytes are padded to the four-byte minimum. Byte-
`
`granularity reads and writes are supported with a four-byte mask field preceding the data.
`
`This is useful when transferring data to or from graphics frame buffers where the
`
`application should only affect certain bytes that may correspond to one primary color or
`
`other characteristics of the displayed pixels. A control bit in the command indicates
`
`whether the writes are byte or doubleword granularity.
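
The length rules for data packets can be sketched as a small helper that pads a payload to the next four-byte boundary and enforces the 64-byte limit. This is an illustrative sketch: the function name is hypothetical, and the choice of zero bytes for padding, as well as padding lengths above four bytes up to the next multiple of four, are assumptions not stated in this paper.

```python
MAX_DATA_PACKET_BYTES = 64  # data packets are four to 64 bytes long


def pad_payload(payload: bytes) -> bytes:
    """Pad a payload so it forms a legal data packet body.

    Lengths are rounded up to a multiple of four bytes. Zero-filled padding
    is an assumption made for this sketch.
    """
    if not payload:
        raise ValueError("payload must contain at least one byte")
    padded_len = -(-len(payload) // 4) * 4  # round up to a multiple of four
    if padded_len > MAX_DATA_PACKET_BYTES:
        raise ValueError("payload exceeds the 64-byte data packet limit")
    return payload + bytes(padded_len - len(payload))
```

A one-byte transfer therefore occupies the four-byte minimum, and a payload longer than 64 bytes has to be split across several commands by the caller.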
`
`Address Mapping
`
Reads and writes to PCI I/O space are mapped into a separate address range, eliminating the need for separate memory and I/O control lines or control bits in read and write commands.
`
`Additional address ranges are used for in-band signaling of interrupts and system
`
`management messages. A device signaling an interrupt performs a byte-granularity write
`
`command targeted at the reserved address space. The host bridge is responsible for
`
`delivery of the interrupt to the internal target.
`
I/O Stream Identification
`
`Communications between the HyperTransport host bridge and other HyperTransport
`
`technology-enabled devices use the concept of streams. A HyperTransport link can
`
`handle multiple streams between devices simultaneously. HyperTransport technology
`
`devices are daisy-chained, so that some streams may be passed through one node to the
`
`next.
`
`
`Packets are identified as belonging to a stream by the Unit ID field in the packet
`
`header, as shown in Figure 7. There can be up to 32 unique IDs within a HyperTransport
`
`chain. Nodes within a HyperTransport chain may contain multiple units.
`
Figure 7. I/O Streams Use Unit IDs (diagram labels: HyperTransport-to-PCI-X tunnel with a single Unit ID; I/O hub with multiple Unit IDs; commands, addresses, and data; reads and responses)
`
`It is the responsibility of each node to determine if information sent to it is targeted at
`
`a device within it. If not, the information is passed through to the next node. If a device is
`
`located at the end of the chain and it is not the target device, an error response is passed
`
`back to the host bridge.
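
The pass-through behavior described above can be sketched as a walk along the chain. This is a simplified illustration: the `Node` class, the `targets` field, and the `deliver` function are hypothetical, and how a real device decides that a packet is meant for it is not described in this paper, so the targets are modeled as opaque identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    targets: frozenset  # opaque identifiers this node claims (hypothetical)


def deliver(chain, target):
    """Pass a packet down a chain, nearest node first.

    Returns (accepting_node_name, hops). If the last node in the chain is
    not the target either, the name is None, standing in for the error
    response that is passed back to the host bridge.
    """
    for hops, node in enumerate(chain, start=1):
        if target in node.targets:
            return node.name, hops
    return None, len(chain)
```

For a chain of one tunnel followed by a cave, a packet aimed at the cave passes through the tunnel and is accepted on the second hop, while a packet that no node claims reaches the end of the chain and produces the error case.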
`
`Commands and responses sent from the host bridge have a Unit ID of zero.
`
`Commands and responses sent from other HyperTransport technology devices on the
`
`chain have their own unique ID.
`
`If a bus-mastering HyperTransport technology device like a RAID controller sends a
`
`write command to memory above the host bridge, the command will be sent with the Unit
`
`ID of the RAID controller. HyperTransport technology permits posted write operations so
`
`that these devices do not wait for an acknowledgement before proceeding. This is useful
`
`for large data transfers that will be buffered at the receiving end.
`
`Ordering Rules
`
`Within streams, the HyperTransport I/O link protocol implements the same basic
`
`ordering rules as PCI. Additionally, there are features that allow these ordering rules to be
`
`relaxed. A Fence command aligns posted cycles in all streams, and a Flush command
`
`flushes the posted write channel in one stream. These features are helpful in handling
`
protocols for bridges to other buses such as PCI, InfiniBand, and AGP.
`
`
`Session Layer
`
`The session layer includes link width optimization and link frequency optimization
`
`along with interrupt and power state capabilities. These topics are discussed in more
`
detail in the HyperTransport™ Technology Input/Output Link Protocol Specification (#23888).
`
Standard Plug ‘n Play Conventions
`
Devices enabled with HyperTransport technology use standard “Plug ‘n Play” conventions for exposing the control registers that enable configuration routines to optimize the width of each data path. AMD registered the HyperTransport Specific Capabilities Block with the PCI SIG. This Capabilities Block, illustrated in Figure 8, permits devices enabled with HyperTransport technology to be configured by any operating system that supports a PCI architecture.
`
Figure 8. HyperTransport™ Technology Capabilities Block (diagram labels: Capabilities Pointer; Other Capabilities Block(s), e.g., ACPI)
`
Since system enumeration and power-up are implementation-specific, it is assumed that system firmware will recognize the Capabilities Block and use the information within it to configure all HyperTransport host bridges in the system. Once the host bridges are identified, devices enabled with HyperTransport technology that are connected to the bridges can be enumerated just as they are for PCI devices. Configuration information that is collected and the structures created by this process will look to a Plug ‘n Play-aware operating system (OS) just like those of PCI devices. In short, the Plug ‘n Play-aware OS does not require any modification to recognize and configure devices enabled with HyperTransport technology.
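
How firmware might locate the Capabilities Block can be sketched as a walk over the standard PCI capability list. The register offsets and the capability ID below come from the PCI specifications rather than from this white paper, and the function operates on a plain 256-byte snapshot of configuration space, so it is a sketch of the lookup only and not firmware code.

```python
PCI_STATUS = 0x06                 # Status register offset
PCI_STATUS_CAP_LIST = 0x10        # Status bit 4: capability list present
PCI_CAPABILITY_POINTER = 0x34     # offset of the first capability
PCI_CAP_ID_HYPERTRANSPORT = 0x08  # capability ID assigned to HyperTransport


def find_capability(config: bytes, cap_id: int):
    """Return the offset of the first capability with `cap_id`, or None.

    `config` is a 256-byte snapshot of a function's configuration space.
    """
    if len(config) < 256:
        raise ValueError("expected 256 bytes of configuration space")
    status = config[PCI_STATUS] | (config[PCI_STATUS + 1] << 8)
    if not status & PCI_STATUS_CAP_LIST:
        return None
    offset = config[PCI_CAPABILITY_POINTER] & 0xFC
    visited = set()  # guards against a malformed, looping list
    while offset and offset not in visited:
        visited.add(offset)
        if config[offset] == cap_id:
            return offset
        offset = config[offset + 1] & 0xFC  # next-capability pointer
    return None
```

Calling `find_capability(config, PCI_CAP_ID_HYPERTRANSPORT)` on a device's configuration snapshot returns the offset of its HyperTransport block, skipping over any other capability blocks chained ahead of it.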
`
`
`Minimal Device Driver Porting
`
`Drivers for devices enabled with HyperTransport technology are unique to the
`
`devices just as they are to PCI I/O devices, but the similarities are great. Companies that
`
`build a PCI I/O device and then create an equivalent device enabled with HyperTransport
`
`technology should have no problems porting the driver. To make porting easier, the chain
`
`from a host bridge is enumerated like a PCI bus, and devices and functions within a
`
`device enabled with HyperTransport technology are enumerated like PCI devices and
`
`functions, as shown in Figure 9.
`
Figure 9. Familiar Bus and Device Enumeration (diagram labels: HyperTransport Chain (Bus 0); HyperTransport PCI-X Tunnel; PCI-X (Bus 1); PCI-32 (Bus 2))
`
`Link Width Optimization
`
`The initial link-width negotiation sequence may result in links that do not operate at
`
`their maximum width potential. All 16-bit, 32-bit, and asymmetrically-sized
`
`configurations must be enabled by a software initialization step. At cold reset, all links
`
`power-up and synchronize according to the protocol. Firmware (or BIOS) then
`
`interrogates all the links in the system, reprograms them to the desired width, and takes
`
`the system through a warm reset to change the link widths.
`
`Devices that implement the LDTSTOP# signal can disconnect and reconnect rather
`
`than enter warm reset to invoke link width changes.
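
The difference between the width a link comes up with at cold reset and the width firmware can later program can be sketched as follows. This is a simplified model under stated assumptions: the `Port` class and both functions are hypothetical, and the rule that hardware negotiation settles on a symmetric width of at most 8 bits is inferred from the statement that 16-bit, 32-bit, and asymmetric configurations require a software step.

```python
from dataclasses import dataclass

HARDWARE_NEGOTIATION_MAX = 8  # assumed cap: wider links need a software step


@dataclass(frozen=True)
class Port:
    tx_width: int  # transmitter width in bits
    rx_width: int  # receiver width in bits


def cold_reset_widths(a: Port, b: Port):
    """Symmetric (a_to_b, b_to_a) widths assumed after hardware negotiation."""
    common = min(a.tx_width, b.rx_width, b.tx_width, a.rx_width,
                 HARDWARE_NEGOTIATION_MAX)
    return common, common


def optimized_widths(a: Port, b: Port):
    """Widest common width in each direction, as firmware would program it."""
    return min(a.tx_width, b.rx_width), min(b.tx_width, a.rx_width)
```

For two devices where one direction supports 16 bits and the other only 8, the model starts at 8 bits in both directions and firmware then widens the first direction to 16 bits, with the change taking effect after a warm reset or an LDTSTOP# disconnect and reconnect.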
`
`
`Link Frequency Initialization
`
At cold reset, all links power up with 200-MHz clocks. For each link, firmware reads
`
`a specific register of each device to determine the supported clock frequencies. The
`
`reported frequency capability, combined with system-specific information about the
`
`board layout and power requirements, is used to determine the frequency to be used for
`
`each link. Firmware then writes the two frequency registers to set the frequency for each
`
`link. Once all devices have been configured, firmware initiates an LDTSTOP# disconnect
`
`or RESET# of the affected chain to cause the new frequency to take effect.
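
The frequency selection step described above can be sketched as choosing the highest clock that both ends of a link report and that the board can sustain. This is a minimal sketch: the function name, the representation of capabilities as sets of MHz values, and the single `board_limit_mhz` parameter are assumptions standing in for the device registers and system-specific information the text mentions.

```python
COLD_RESET_CLOCK_MHZ = 200  # all links power up at this clock


def select_link_frequency(caps_a, caps_b, board_limit_mhz):
    """Pick the highest clock supported by both devices and the board.

    Falls back to the cold-reset clock when no reported frequency fits.
    """
    common = set(caps_a) & set(caps_b)
    usable = [freq for freq in common if freq <= board_limit_mhz]
    return max(usable, default=COLD_RESET_CLOCK_MHZ)
```

In this model, devices reporting {200, 400, 800} and {200, 400} on a board limited to 600 MHz would be programmed for 400 MHz, and the new setting would take effect only after the LDTSTOP# disconnect or RESET# of the chain.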
`
`Implementation Examples
`
`Daisy Chain
`
`HyperTransport technology has a daisy-chain topology, giving the opportunity to
`
`connect multiple HyperTransport input/output bridges to a singl