Interconnection Networks

An Engineering Approach

José Duato
Sudhakar Yalamanchili
Lionel Ni

IEEE Computer Society
Los Alamitos, California • Washington • Brussels • Tokyo

Library of Congress Cataloging-in-Publication Data

Duato, José
Interconnection networks: an engineering approach / José Duato, Sudhakar Yalamanchili, Lionel Ni.
p. cm.
Includes bibliographical references.
ISBN 0-8186-7800-3
1. Computer networks. 2. Multiprocessors. I. Yalamanchili, Sudhakar. II. Ni, Lionel M. III. Title.
TK5105.5.D88 1997
004.6-dc21
97-20502
CIP

Copyright © 1997 by The Institute of Electrical and Electronics Engineers, Inc. All rights reserved.

Copyright and Reprint Permissions: Abstracting is permitted with credit to the source. Libraries are permitted to photocopy isolated pages beyond the limits of US copyright law, for private use of their patrons. Other copying, reprint, or republication requests should be addressed to: IEEE Copyrights Manager, IEEE Service Center, 445 Hoes Lane, P.O. Box 1331, Piscataway, NJ 08855-1331.

IEEE Computer Society Press Order Number BP07800
Library of Congress Number 97-20502
ISBN 0-8186-7800-3

Additional copies may be ordered from:

IEEE Computer Society Press
Customer Service Center
10662 Los Vaqueros Circle
P.O. Box 3014
Los Alamitos, CA 90720-1314
Tel: +1-714-821-8380
Fax: +1-714-821-4641
Email: cs.books@computer.org

IEEE Service Center
445 Hoes Lane
P.O. Box 1331
Piscataway, NJ 08855-1331
Tel: +1-908-981-1393
Fax: +1-908-981-9667
mis.custserv@computer.org

IEEE Computer Society
13, Avenue de l'Aquilon
B-1200 Brussels
BELGIUM
Tel: +32-2-770-2198
Fax: +32-2-770-8505
euro.ofc@computer.org

IEEE Computer Society
Ooshima Building
2-19-1 Minami-Aoyama
Minato-ku, Tokyo 107
JAPAN
Tel: +81-3-3408-3118
Fax: +81-3-3408-3553
tokyo.ofc@computer.org

Editor-in-Chief: Mohamed Fayad
Publisher: Matt Loeb
Acquisitions Editor: Bill Sanders
Developmental Editor: Cheryl Baltes
Advertising/Promotions: Tom Fink
Production Editor: Lisa O'Conner
Cover Artist: John Marshall

About the cover: Interconnection Networks by John Marshall, oil on canvas, 1997, is based on a concept by José Duato. The painting includes symbols representing important elements in the field. To find out more about the symbols in the painting, visit the book's website at http://computer.org/books.

Printed in the United States of America

Chapter 1

Introduction

Interconnection networks are currently being used for many different applications, ranging from internal buses in very large-scale integration (VLSI) circuits to wide area computer networks. Among others, these applications include backplane buses and system area networks, telephone switches, internal networks for asynchronous transfer mode (ATM) switches, processor/memory interconnects for vector supercomputers, interconnection networks for multicomputers and distributed shared-memory multiprocessors, clusters of workstations, local area networks, metropolitan area networks, wide area computer networks, and networks for industrial applications. Additionally, the number of applications requiring interconnection networks is continuously growing. For example, an integral control system for a car requires a network connecting several microprocessors and devices.

The characteristics and cost of these networks depend considerably on the application. There are no general solutions. For some applications, interconnection networks have been studied in depth for decades. This is the case for telephone networks, computer networks, and backplane buses. These networks are covered in many books. However, some other applications have not been fully covered in the existing literature. This is the case for the interconnection networks used in multicomputers and distributed shared-memory multiprocessors.

The lack of standards and the need for very high performance and reliability pushed the development of interconnection networks for multicomputers. This technology was transferred to distributed shared-memory multiprocessors, improving the scalability of those machines. However, distributed shared-memory multiprocessors require an even higher network performance than multicomputers, pushing the development of interconnection networks even more. More recently, this network technology began to be transferred to local area networks (LANs). Also, it has been proposed as a replacement for backplane buses, creating the concept of the system area network. Hence, the advances in interconnection networks for multicomputers are the basis for the development of interconnection networks for other architectures and environments. Therefore, there is a need for structuring the concepts and solutions for this kind of interconnection network. Obviously, when this technology is transferred to another environment, new issues arise that have to be addressed.

Moreover, several of these networks are evolving very quickly, and the solutions proposed for different kinds of networks are overlapping. Thus, there is a need for formally stating the basic concepts, the alternative design choices, and the design trade-offs for most of those networks. In this book, we take that challenge and present in a structured way the basic underlying concepts of most interconnection networks, as well as the most interesting solutions currently implemented or proposed in the literature. As indicated above, the network technology developed for multicomputers has been transferred to other environments. Therefore, in this book we will mainly describe techniques developed for multicomputer networks. Most of these techniques can also be applied to distributed shared-memory multiprocessors, and to local and system area networks. However, we will also describe techniques specifically developed for these environments.

1.1 Parallel Computing and Networks

The demand for ever more computing power has never stopped. Although the performance of processors doubled approximately every three years from 1980 to 1996, the complexity of the software as well as the scale and solution quality of applications have continuously driven the development of even faster processors. A number of important problems have been identified in the areas of defense, aerospace, automotive applications, and science, whose solution requires a tremendous amount of computational power. In order to solve these grand challenge problems, the goal has been to obtain computer systems capable of computing at the teraflops (10^12 floating-point operations per second) level. Even the smallest of these problems requires gigaflops (10^9 floating-point operations per second) of performance for hours at a time. The largest problems require teraflops performance for more than a thousand hours at a time.
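
As a rough back-of-the-envelope check of these magnitudes, assume (purely for illustration, the figure is not from the text) a workload of 3.6 x 10^18 floating-point operations, sized so that a 1-teraflops machine needs about a thousand hours:

```python
# Back-of-the-envelope runtime for an assumed 3.6e18-operation workload.
workload_flop = 3.6e18   # total floating-point operations (illustrative assumption)
gigaflops = 1e9          # 10^9 FLOP/s
teraflops = 1e12         # 10^12 FLOP/s

def hours_at(rate_flops):
    """Runtime in hours at a sustained computation rate."""
    return workload_flop / rate_flops / 3600.0

print(f"at 1 gigaflops: {hours_at(gigaflops):,.0f} hours (~{hours_at(gigaflops) / 8760:.0f} years)")
print(f"at 1 teraflops: {hours_at(teraflops):,.0f} hours")
```

At gigaflops rates the same job would run for roughly a century, which is why the grand challenge goal is stated in teraflops.
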
Parallel computers with multiple processors are opening the door to teraflops computing performance to meet the increasing demand for computational power. The demand includes more computing power, higher network and input/output (I/O) bandwidths, and more memory and storage capacity. Even for applications requiring a lower computing power, parallel computers can be a cost-effective solution. Processors are becoming very complex. As a consequence, processor design cost is growing so fast that only a few companies all over the world can afford to design a new processor. Moreover, design cost should be amortized by selling a very high number of units. Currently, personal computers and workstations dominate the computing market. Therefore, designing custom processors that boost performance by one order of magnitude is not cost-effective. Similarly, designing and manufacturing high-speed memories and disks is not cost-effective. The alternative choice consists of designing parallel computers from commodity components (processors, memories, disks, interconnects, etc.). In these parallel computers, several processors cooperate to solve a large problem. Memory bandwidth can be scaled with processor computing power by physically distributing memory components among processors. Also, redundant arrays of inexpensive disks (RAID) allow the implementation of high-capacity, reliable parallel file systems meeting the performance requirements of parallel computers.

However, a parallel computer requires some kind of communication subsystem to interconnect processors, memories, disks, and other peripherals. The specific requirements of these communication subsystems depend on the architecture of the parallel computer. The simplest solution consists of connecting processors to memories and disks as if there were a single processor, using system buses and I/O buses. Then, processors can be interconnected using the interfaces to local area networks. Unfortunately, commodity communication subsystems have been designed to meet a different set of requirements, i.e., those arising in computer networks. Although networks of workstations have been proposed as an inexpensive approach to build parallel computers, the communication subsystem becomes the bottleneck in most applications.

Therefore, designing high-performance interconnection networks becomes a critical issue in exploiting the performance of parallel computers. Moreover, as the interconnection network is the only subsystem that cannot be efficiently implemented by using commodity components, its design becomes very critical. This issue motivated the writing of this book. Up to now, most manufacturers designed custom interconnection networks (nCUBE-2, nCUBE-3, Intel Paragon, Cray T3D, Cray T3E, Thinking Machines Corp. CM-5, NEC Cenju-3, IBM SP2). More recently, several high-performance switches have been developed (Autonet, Myrinet, ServerNet) and are being marketed. These switches are targeted at workstations and personal computers, offering the customer the possibility of building an inexpensive parallel computer by connecting cost-effective computers through high-performance switches. The main issues arising in the design of networks for both approaches are covered in this book.

[Figure 1.1. Schematic representation of parallel computers: (a) a multicomputer; (b) a UMA shared-memory multiprocessor. (M = memory; P = processor.)]

1.2 Parallel Computer Architectures

In this section, we briefly introduce the most popular parallel computer architectures. This description will focus on the role of the interconnection network. A more detailed description is beyond the scope of this book.

The idea of using commodity components for the design of parallel computers led to the development of distributed-memory multiprocessors, or multicomputers, in the early 1980s. These parallel computers consist of a set of processors, each one connected to its own local memory. Processors communicate with each other by passing messages through an interconnection network. Figure 1.1a shows a simple scheme for this architecture. The first commercial multicomputers utilized commodity components, including Ethernet controllers, to implement communication between processors. Unfortunately, commodity communication subsystems were too slow, and the interconnection network became the bottleneck of those parallel computers. Several research efforts led to the development of interconnection networks that are several orders of magnitude faster than Ethernet networks. Most of the performance gain is due to architectural rather than technological improvements.

Programming multicomputers is not an easy task. The programmer has to take care of distributing code and data among the processors in an efficient way, invoking message-passing calls whenever some data are needed by other processors. On the other hand, shared-memory multiprocessors provide a single memory space to all the processors, simplifying the task of exchanging data among processors. Access to shared memory has traditionally been implemented by using an interconnection network between processors and memory (Figure 1.1b). This architecture is referred to as a uniform memory access (UMA) architecture. It is not scalable because memory access time includes the latency of the interconnection network, and this latency increases with system size.

More recently, shared-memory multiprocessors followed some trends previously established for multicomputers. In particular, memory has been physically distributed among processors, thereby reducing the memory access time for local accesses and increasing scalability. These parallel computers are referred to as distributed shared-memory multiprocessors (DSMs). Accesses to remote memory are performed through an interconnection network, very much like in multicomputers. The main difference between DSMs and multicomputers is that messages are initiated by memory accesses rather than by calling a system function. In order to reduce memory latency, each processor has several levels of cache memory, thus matching the speed of processors and memories. This architecture provides nonuniform memory access (NUMA) time. Indeed, most of the nonuniformity is due to the different access time between caches and main memories, rather than the different access time between local and remote memories. The main problem arising in DSMs is cache coherence. Several hardware and software cache coherence protocols have been proposed. These protocols produce additional traffic through the interconnection network.

The use of custom interconnects makes multicomputers and DSMs quite expensive. So, networks of workstations (NOWs) have been proposed as an inexpensive approach to build parallel computers. NOWs take advantage of recent developments in LANs. In particular, the use of ATM switches has been proposed to implement NOWs. However, ATM switches are still expensive, which has motivated the development of high-performance switches specifically designed to provide a cost-effective interconnect for workstations and personal computers.

Although there are many similarities between interconnection networks for multicomputers and DSMs, it is important to keep in mind that performance requirements may be very different. Messages are usually very short when DSMs are used. Additionally, network latency is important because memory access time depends on that latency. However, messages are typically longer and less frequent when using multicomputers. Usually the programmer is able to adjust the granularity of message communication in a multicomputer. On the other hand, interconnection networks for multicomputers and NOWs are mainly used for message passing. However, the geographical distribution of workstations usually imposes constraints on the way processors are connected. Also, individual processors may be connected to or disconnected from the network at any time, thus imposing additional design constraints.

1.3 Network Design Considerations

Interconnection networks play a major role in the performance of modern parallel computers. There are many factors that may affect the choice of an appropriate interconnection network for the underlying parallel computing platform. These factors include:

1. Performance requirements. Processes executing in different processors synchronize and communicate through the interconnection network. These operations are usually performed by explicit message passing or by accessing shared variables. Message latency is the time elapsed between the time a message is generated at its source node and the time the message is delivered at its destination node. Message latency directly affects processor idle time and memory access time to remote memory locations. Also, the network may saturate: it may be unable to deliver the flow of messages injected by the nodes, limiting the effective computing power of a parallel computer. The maximum amount of information delivered by the network per time unit defines the throughput of that network. (A short sketch of these two measures follows this list.)

2. Scalability. A scalable architecture implies that as more processors are added, their memory bandwidth, I/O bandwidth, and network bandwidth should increase proportionally. Otherwise the components whose bandwidth does not scale may become a bottleneck for the rest of the system, decreasing the overall efficiency accordingly.

3. Incremental expandability. Customers are unlikely to purchase a parallel computer with a full set of processors and memories. As the budget permits, more processors and memories may be added until a system's maximum configuration is reached. In some interconnection networks, the number of processors must be a power of 2, which makes them difficult to expand. In other cases, expandability is provided at the cost of wasting resources. For example, a network designed for a maximum size of 1,024 nodes may contain many unused communication links when the network is implemented with a smaller size. Interconnection networks should provide incremental expandability, allowing the addition of a small number of nodes while minimizing resource waste.

4. Partitionability. Parallel computers are usually shared by several users at a time. In this case, it is desirable that the network traffic produced by each user does not affect the performance of other applications. This can be ensured if the network can be partitioned into smaller functional subsystems. Partitionability may also be required for security reasons.

5. Simplicity. Simple designs often lead to higher clock frequencies and may achieve higher performance. Additionally, customers appreciate networks that are easy to understand, because it is easier to exploit their performance.

6. Distance span. This factor may lead to very different implementations. In multicomputers and DSMs, the network is assembled inside a few cabinets. The maximum distance between nodes is small. As a consequence, signals are usually transmitted using copper wires. These wires can be arranged regularly, reducing the computer size and wire length. In NOWs, links have very different lengths and some links may be very long, producing problems such as coupling, electromagnetic noise, and heavy link cables. The use of optical links solves these problems, equalizing the bandwidth of short and long links up to a much greater distance than when copper wire is used. Also, geographical constraints may impose the use of irregular connection patterns between nodes, making distributed control more difficult to implement.

7. Physical constraints. An interconnection network connects processors, memories, and/or I/O devices. It is desirable for a network to accommodate a large number of components while maintaining a low communication latency. As the number of components increases, the number of wires needed to interconnect them also increases. Packaging these components together usually requires meeting certain physical constraints, such as operating temperature control, wiring length limitation, and space limitation. Two major implementation problems in large networks are the arrangement of wires in a limited area and the number of pins per chip (or board) dedicated to communication channels. In other words, the complexity of the connection is limited by the maximum wire density possible and by the maximum pin count. The speed at which a machine can run is limited by the wire lengths, and the majority of the power consumed by the system is used to drive the wires. This is an important and challenging issue to be considered. Different engineering technologies for packaging, wiring, and maintenance should be considered.

8. Reliability and repairability. An interconnection network should be able to deliver information reliably. Interconnection networks can be designed for continuous operation in the presence of a limited number of faults. These networks are able to send messages through alternative paths when some faults are detected. In addition to reliability, interconnection networks should have a modular design, allowing hot upgrades and repairs. Nodes can also fail or be removed from the network. In particular, a node can be powered off in a network of workstations. Thus, NOWs usually require some reconfiguration algorithm for the automatic reconfiguration of the network when a node is powered on or off.

9. Expected workloads. Users of a general-purpose machine may have very different requirements. If the kinds of applications that will be executed in the parallel computer are known in advance, it may be possible to extract some information on usual communication patterns, message sizes, network load, etc. That information can be used for the optimization of some design parameters. When it is not possible to get information on expected workloads, the network design should be robust, i.e., design parameters should be selected in such a way that performance is good over a wide range of traffic conditions.

10. Cost constraints. Finally, it is obvious that the "best" network may be too expensive. Design decisions very often are trade-offs between cost and other design factors. Fortunately, cost is not always directly proportional to performance. Using commodity components whenever possible may considerably reduce the overall cost.
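
As a minimal sketch of the two measures defined in the first consideration (the message size, timings, and bit counts below are illustrative assumptions, not figures from the text):

```python
# Sketch of the latency and throughput definitions from consideration 1.
def message_latency(t_generated_s, t_delivered_s):
    """Time elapsed between generation at the source node and delivery
    at the destination node."""
    return t_delivered_s - t_generated_s

def throughput_bps(bits_delivered, interval_s):
    """Information delivered by the network per unit of time."""
    return bits_delivered / interval_s

# A 512-bit message generated at t = 0 and delivered 3 microseconds later:
print(message_latency(0.0, 3e-6))   # 3e-06 (seconds)
# 10^9 bits delivered in a 1-second window; offered loads beyond this
# sustained maximum would saturate the network:
print(throughput_bps(1e9, 1.0))     # 1000000000.0 (bits/s)
```
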
1.4 Classification of Interconnection Networks

Among other criteria, interconnection networks have traditionally been classified according to the operating mode (synchronous or asynchronous) and network control (centralized, decentralized, or distributed). Nowadays, multicomputers, multiprocessors, and NOWs dominate the parallel computing market. All of these architectures implement asynchronous networks with distributed control. Therefore, we will focus on other criteria that are currently more significant.

A classification scheme is shown in Figure 1.2, which categorizes the known interconnection networks into four major classes based primarily on network topology: shared-medium networks, direct networks, indirect networks, and hybrid networks. For each class, the figure shows a hierarchy of subclasses, also indicating some real implementations for most of them. This classification scheme is based on the classification proposed in [252], and it mainly focuses on networks that have been implemented. It is by no means complete, as other new and innovative interconnection networks may emerge as technology further advances, such as mobile communication and optical interconnections.

In shared-medium networks, the transmission medium is shared by all communicating devices. An alternative to this approach consists of having point-to-point links directly connecting each communicating device to a (usually small) subset of other communicating devices in the network. In this case, any communication between nonneighboring devices requires transmitting the information through several intermediate devices. These networks are known as direct networks. Instead of directly connecting the communicating devices to each other, indirect networks connect those devices by means of one or more switches. If several switches exist, they are connected to each other using point-to-point links. In this case, any communication between communicating devices requires transmitting the information through one or more switches. Finally, hybrid approaches are possible. These network classes and the corresponding subclasses will be described in the following sections.
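
To make the direct-network idea concrete (each device linked by point-to-point channels to a small subset of neighbors), the sketch below builds the neighbor lists of a 2-D mesh; the 4 x 4 size and row-major node numbering are illustrative choices, not taken from the text:

```python
# Sketch: neighbor lists for a direct network with a 2-D mesh topology.
# The 4x4 size is an arbitrary illustrative choice.
def mesh_neighbors(rows, cols):
    """Return {node: [neighbors]} for a rows x cols 2-D mesh."""
    adj = {}
    for r in range(rows):
        for c in range(cols):
            node = r * cols + c
            steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
            adj[node] = [nr * cols + nc
                         for dr, dc in steps
                         if 0 <= (nr := r + dr) < rows and 0 <= (nc := c + dc) < cols]
    return adj

adj = mesh_neighbors(4, 4)
print(adj[0])   # corner node: [4, 1] -- only 2 point-to-point links
print(adj[5])   # interior node: [1, 9, 4, 6] -- 4 point-to-point links
```

A message between nonneighboring nodes, say node 0 and node 15, would have to be forwarded through intermediate nodes, exactly as described above.
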
1.5 Shared-Medium Networks

The least complex interconnect structure is one in which the transmission medium is shared by all communicating devices. In such shared-medium networks, only one device is allowed to use the network at a time. Every device attached to the network has requester, driver, and receiver circuits to handle the passing of address and data. The network itself is usually passive, since it does not generate messages.

Interconnection Networks

- Shared-Medium Networks
  - Local Area Networks
    - Contention Bus (Ethernet)
    - Token Bus (Arcnet)
    - Token Ring (FDDI Ring, IBM Token Ring)
  - Backplane Bus (Sun Gigaplane, DEC AlphaServer 8X00, SGI PowerPath-2)

- Direct Networks (Router-Based Networks)
  - Strictly Orthogonal Topologies
    - Mesh
      - 2-D Mesh (Intel Paragon)
      - 3-D Mesh (MIT J-Machine)
    - Torus (k-ary n-cube)
      - 1-D Unidirectional Torus or Ring (KSR first-level ring)
      - 2-D Bidirectional Torus (Intel/CMU iWarp)
      - 3-D Bidirectional Torus (Cray T3D, Cray T3E)
    - Hypercube (Intel iPSC, nCUBE)
  - Other Topologies: Trees, Cube-Connected Cycles, de Bruijn Network, Star Graphs, etc.

- Indirect Networks (Switch-Based Networks)
  - Regular Topologies
    - Crossbar (Cray X/Y-MP, DEC GIGAswitch, Myrinet)
    - Multistage Interconnection Networks
      - Blocking Networks
        - Unidirectional MIN (NEC Cenju-3, IBM RP3)
        - Bidirectional MIN (IBM SP, TMC CM-5, Meiko CS-2)
      - Nonblocking Networks: Clos Network
  - Irregular Topologies (DEC Autonet, Myrinet, ServerNet)

- Hybrid Networks
  - Multiple-Backplane Buses (Sun XDBus)
  - Hierarchical Networks (Bridged LANs, KSR)
  - Cluster-Based Networks (Stanford DASH, HP/Convex Exemplar)
  - Other Hypergraph Topologies: Hyperbuses, Hypermeshes, etc.

Figure 1.2. Classification of interconnection networks. (1-D = one-dimensional; 2-D = two-dimensional; 3-D = three-dimensional; CMU = Carnegie Mellon University; DASH = Directory Architecture for Shared-Memory; DEC = Digital Equipment Corp.; FDDI = Fiber Distributed Data Interface; HP = Hewlett-Packard; KSR = Kendall Square Research; MIN = Multistage Interconnection Network; MIT = Massachusetts Institute of Technology; SGI = Silicon Graphics Inc.; TMC = Thinking Machines Corp.)


An important issue here is the arbitration strategy that determines the mastership of the shared-medium network to resolve network access conflicts. A unique characteristic of a shared medium is its ability to support atomic broadcast, in which all devices on the medium can monitor network activities and receive the information transmitted on the shared medium. This property is important to efficiently support many applications requiring one-to-all or one-to-many communication services, such as barrier synchronization and snoopy cache coherence protocols. Due to limited network bandwidth, a single shared medium can support only a limited number of devices before the medium becomes a bottleneck.

Shared-medium networks constitute a well-established technology. Additionally, their limited bandwidth restricts their use in multiprocessors. So, these networks will not be covered in this book, but we will present a short introduction in the following sections. There are two major classes of shared-medium networks: local area networks, mainly used to construct computer networks that span physical distances no longer than a few kilometers, and backplane buses, mainly used for internal communication in uniprocessors and multiprocessors.

1.5.1 Shared-Medium Local Area Networks

High-speed LANs can be used as a networking backbone to interconnect computers to provide an integrated parallel and distributed computing environment. Physically, a shared-medium LAN uses copper wires or fiber optics in a bit-serial fashion as the transmission medium. The network topology is either a bus or a ring. Depending on the arbitration mechanism used, different LANs have become commercially available. For performance and implementation reasons, it is impractical to have centralized control or some fixed access assignment to determine the bus master that can access the bus. Three major classes of LANs based on distributed control are described below.

Contention Bus

The most popular bus arbitration mechanism is to have all devices compete for the exclusive access right to the bus. Due to the broadcast nature of the bus, all devices can monitor the state of the bus, such as idle, busy, and collision. Here the term "collision" means that two or more devices are using the bus at the same time and their data have collided. When a collision is detected, the competing devices will quit transmitting and try again later. The best-known contention-based LAN is Ethernet, which adopts the carrier-sense multiple access with collision detection (CSMA/CD) protocol. The bandwidth of Ethernet is 10 Mbps and the distance span is 250 meters (coaxial cable). As processors are getting faster, the number of devices that can be connected to Ethernet is limited to avoid the network bottleneck. In order to break the 10 Mbps bandwidth barrier, Fast Ethernet can provide 100 Mbps bandwidth.
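
The sketch below shows the core of the contention-bus behavior just described: sense the medium, transmit, and on collision back off before retrying. The binary exponential backoff follows the classic Ethernet style, but the constants and the stub callbacks are illustrative assumptions rather than the exact protocol parameters:

```python
import random
import time

SLOT_TIME_S = 51.2e-6    # contention slot; 51.2 us matches classic 10 Mbps Ethernet
MAX_ATTEMPTS = 16        # illustrative retry limit

def csma_cd_send(channel_busy, transmit, collided):
    """Carrier-sense the bus, transmit, and on collision wait a random
    number of slots (binary exponential backoff) before retrying."""
    for attempt in range(MAX_ATTEMPTS):
        while channel_busy():            # carrier sense: wait for an idle bus
            time.sleep(SLOT_TIME_S)
        transmit()                       # drive the frame onto the shared medium
        if not collided():               # collision detection while sending
            return True
        k = min(attempt + 1, 10)         # contention window doubles per collision
        time.sleep(random.randrange(2 ** k) * SLOT_TIME_S)
    return False                         # excessive collisions: give up

# Toy demo with stub callbacks: the bus is always idle and the first
# attempt collides, so the frame goes out on the retry.
outcomes = iter([True, False])
ok = csma_cd_send(lambda: False, lambda: None, lambda: next(outcomes))
print(ok)                                # True
```
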
Token Bus

One drawback of the contention bus is its nondeterministic nature, as there is no guarantee of how much waiting time is required to gain the bus access right. Thus, the contention bus is not suitable to support real-time applications. To remove the nondeterministic behavior, an alternative approach involves passing a token among the network devices. The owner of the token has the right to access the bus. Upon completion of the transmission, the token is passed to the next device based on some scheduling discipline. By restricting the maximum token holding time, the upper bound on the time a device has to wait for the token can be guaranteed. Arcnet supports a token bus with a bandwidth of 2.5 Mbps.
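
The guaranteed upper bound mentioned above is easy to state: in the worst case, every other device holds the token once (plus pass delays) before a given device receives it. The device count, holding time, and pass delay below are illustrative assumptions:

```python
# Worst-case token wait: every other device may hold the token once
# before a given device receives it. All parameters are illustrative.
n_devices = 32
max_hold_ms = 5.0       # maximum token holding time per device
token_pass_ms = 0.1     # time to pass the token between neighbors

worst_wait_ms = (n_devices - 1) * (max_hold_ms + token_pass_ms)
print(f"guaranteed upper bound: {worst_wait_ms:.1f} ms")   # 158.1 ms
```
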
Token Ring

The idea of the token ring is a natural extension of the token bus, as the passing of the token forms a ring structure. The IBM Token Ring supports bandwidths of both 4 and 16 Mbps based on coaxial cable. Fiber Distributed Data Interface (FDDI) provides a bandwidth of 100 Mbps using fiber optics.

1.5.2 Shared-Medium Backplane Bus

A backplane bus is the simplest interconnection structure for bus-based parallel computers. It is commonly used to interconnect processor(s) and memory modules to provide a UMA architecture. Figure 1.3 shows a single-bus network.

[Figure 1.3. A single-bus network. (M = memory; P = processor.)]

A typical backplane bus usually has 50-300 wires and is physically realized by printed lines on a circuit board or by discrete (backplane) wiring. Additional costs are incurred by interface electronics, such as line drivers, receivers, and connectors.

There are three kinds of information on the backplane bus: data, address, and control signals. Control signals include the bus request signal and request grant signal, among many others. In addition to the width of the data lines, the maximum bus bandwidth that can be provided depends on the technology. The number of processors that can be put on a bus depends on many factors, such as processor speed, bus bandwidth, cache architecture, and program behavior.
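
As a concrete illustration of how the data-line width and the technology's clock rate bound the peak bus bandwidth (both numbers below are assumed purely for illustration):

```python
# Peak backplane-bus bandwidth = data-path width x clock rate.
# Both parameters below are illustrative assumptions.
data_width_bits = 64     # width of the data lines
bus_clock_hz = 66e6      # clock rate set by the bus technology

peak_bps = data_width_bits * bus_clock_hz               # one transfer per cycle
print(f"peak bandwidth: {peak_bps / 8 / 1e6:.0f} MB/s") # 528 MB/s
```
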
Methods of Information Transfer

Both data and address information must be carried on the bus. In order to increase the bus bandwidth and provide a large address space, both the data width and the number of address bits have to be increased. Such an increase implies a further increase in bus complexity and cost. Some designs try to share address and data lines. For multiplexed transfer, address and data are sent alternately. Hence, they can share the same physical lines and require less power and fewer chips. For nonmultiplexed transfer, address and data lines are separated. Thus, data transfer can be done faster.
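
A rough sketch of the trade-off (the clock rate, width, and cycle counts are illustrative assumptions): a multiplexed bus spends cycles sending the address over the shared lines, while a nonmultiplexed bus can move data every cycle on its dedicated lines:

```python
# Effective data bandwidth: multiplexed vs. nonmultiplexed transfer.
# Clock rate, width, and cycle counts are illustrative assumptions.
clock_hz = 50e6
data_width_bits = 32

def effective_bw_bps(addr_cycles, data_cycles):
    """Fraction of bus cycles that carry data, times the raw bandwidth."""
    return data_width_bits * clock_hz * data_cycles / (addr_cycles + data_cycles)

# Multiplexed: one address cycle, then one data cycle on the same lines.
print(effective_bw_bps(1, 1) / 1e6)   # 800.0 Mbit/s
# Nonmultiplexed: dedicated address lines, data moves every cycle.
print(effective_bw_bps(0, 1) / 1e6)   # 1600.0 Mbit/s
```
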
In synchronous bus design, all devices are synchronized with a common clock. It requires less complicated logic and has been used in most existing buses. However, a synchronous bus is not easily upgradable. Newer, faster processors are difficult to fit into a slow bus.

In asynchronous buses, all devices connected to the bus may have different speeds and their own clocks. They use a handshaking protocol to synchronize with each other. This provides independence for different technologies and allows slower and faster devices with different clock rates to operate together. It also implies that buffering is needed, since slower devices cannot handle messages as fast as faster devices.
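
One common way to realize such a handshaking protocol is a four-phase request/acknowledge exchange; the sketch below models it with two threads standing in for devices with independent clocks. The details vary across real buses, so this is an illustrative assumption rather than any particular bus standard:

```python
import threading

# Sketch of a four-phase asynchronous handshake between a sender and a
# receiver running at their own pace. Events stand in for req/ack wires.
req, ack = threading.Event(), threading.Event()
mailbox = []

def sender(word):
    mailbox.append(word)   # drive data onto the bus lines
    req.set()              # 1) assert request: data is valid
    ack.wait()             # 2) wait for the acknowledge
    req.clear()            # 3) drop the request
    while ack.is_set():    # 4) wait for the acknowledge to drop
        pass

def receiver():
    req.wait()             # wait until the sender signals valid data
    word = mailbox.pop()   # latch the data at the receiver's own speed
    ack.set()              # acknowledge the transfer
    while req.is_set():    # wait for the request to drop
        pass
    ack.clear()
    print("received:", word)

t = threading.Thread(target=receiver)
t.start()
sender(0xCAFE)
t.join()                   # prints: received: 51966
```
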
Bus Arbitration

In a single-bus network, several processors may attempt to use the bus simultaneously. To deal with this, a policy must be implemented that allocates the bus to the processors making such requests. For performance reasons, bus allocation must be carried out by hardware arbiters. Thus, in order to perform a memory access request, the processor has to exclusively own the bus and become the bus master. To become the bus master, each processor implements a bus requester, which is a collection of logic to request
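
The excerpt ends before describing the arbiter logic itself, so as a purely hypothetical illustration, here is one simple policy such a hardware arbiter could implement, fixed priority by processor index; this is an assumption, not the book's design:

```python
# Hypothetical fixed-priority bus arbiter: grants the bus to the
# lowest-numbered requesting processor. The policy is an illustrative
# assumption; the excerpt does not specify the arbitration scheme.
def arbitrate(request_lines):
    """Return the index of the processor granted bus mastership,
    or None when no processor is requesting."""
    for proc, requesting in enumerate(request_lines):
        if requesting:
            return proc          # lowest index = highest priority
    return None

print(arbitrate([False, True, True, False]))  # 1: processor 1 beats processor 2
print(arbitrate([False] * 4))                 # None: the bus stays idle
```
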
