`
`May 1985
`
The 18th-century copper engraving on this month's cover illustrates the rendezvous principle in the XMS distributed computing system described in the lead article. The work is formal and structured, like the XMS network protocol. The suitor climbs a ladder to the rendezvous under the watchful eye of an observer, like a task that steps up to a higher-level process under the guidance of the distributed software so it can interact with other system tasks. The system's resource manager at the least observes the rendezvous, but often steps in as a matchmaker or Cupid to make sure the right tasks meet.
`
`Engraving courtesy of the
`Bettmann Archive
`
Cover design by
Jay Simpson
`
`Next issue:
`Operating Systems for
`Highly Parallel
`Environments
`
`May 1985
`
`Volume 2 Number 3 (ISSN 0740-7459)
`
`/
`
`THEME FEATURES
`5 Experiences with Distributed Systems
`Guest Editors' Introduction Stephen F. Lundstrom and Duncan H. Lawrie
9 XMS: A Rendezvous-Based Distributed System Software Architecture
`Neil Gammage and Liam Casey
`XMS creates a single, powerful system from loosely coupled microcomputers.
`Programs work together across nodes, making systemwide resource management transparent
`and distributed-system design simpler.
`21 Helix: The Architecture of the XMS Distributed File System
`Marek Fridrich and William Older
With abstraction layering and system decomposition, all the user sees is one homogeneous system.
Behind the scenes, the architecture is supporting 15 LANs and close to 1000 workstations.
`30 Amaze: A Multiplayer Computer Game
`Eric J. Berglund and David R. Cheriton
Amaze relies solely on the V kernel for point-to-point communication.
The game's techniques could work in a general class of distributed applications.
`40 High-Level Broadcast Communication for Local Area Networks
`Thomas J . LeBlanc and Robert P. Cook
Even for LANs without broadcast-supporting hardware, this program offers improvements of
from 1.1 to 7.8 over point-to-point message transmission, and that's the worst-case gain.
`49 Multicast Communication on Network Computers
`Ariel J. Frank, Larry D. Wittie, and Arthur J. Bernstein
Channel-oriented packet casting is a predominant feature of Micros, an operating system designed
to explore control and communication techniques for netcomputers with thousands of hosts.
`62 The ARC Network: A Case Study
`Mark C. Paulk
`Designed around VAX and developed for ballistic missile defense systems, this network offers
`error-free message passing and can absorb overhead unacceptable to general-purpose networks.
`
`SPECIAL FEATURES
`70 A Qualitative Assessment of Parallelism in Expert Systems
`Robert J. Douglass
`Developers envision expert systems that can make up to one billion inferences per second.
`This will require full utilization of a system's potential for parallel processing.
`83 Mycin: The Expert System and its Implementation in Loglisp
Sanjai Narain
`A translation of Mycin, an expert system used in medical consultations, into Loglisp, a logic
`programming system, is comparable in terms of clarity, speed, and space requirements.
`
`DEPARTMENTS
`90 Software Standards: Selecting Software
`Documentation Standards
92 New Products: An Account of One-Write Plus
`95 Product Highlights
`98 Personal & Peripheral Industry News
`99 Software Reviews: Telesoft Ada Version 1.3
`101 Soft News
`106 Professional Calendar
`107 Call for Papers
`
`107 Short Courses and Seminars
`108 Book Reviews: The Unix Programming
`Environment; Real World Unix; The
`Evolution of Programs; The Hacker's
`Dictionary; C Primer Plus; Design of
`Operating Systems
`112 Advertiser/Product Index
`Reader Service Cards, p. 112A;
`IEEE-CS Membership Application, p. 82.
`
`
`
`SOME ROBOTS CAN'T WALK AND
`CHEW GUM AT THE SAME TIME ...
`
`BUT THEN THEY PROBABLY NEVER
`HEARD OF FORTH EITHER.
`
polyFORTH II® provides the software designer with the ability to tackle the most precise and intricate robotic-oriented functions. By incorporating maximum flexibility and expandability into one compact package, polyFORTH II has set the standard for state-of-the-art robotic software.
polyFORTH II is "the language of choice" for robotic systems and front-end operations-control systems because Forth permits the programmer to fit the language to the problem. polyFORTH is keyed to real-time applications. Users can extend the system at all levels by adding modular commands and directives in simple task-oriented English. Multiple users can be supported, and any number of asynchronous tasks can run concurrently.
Other real-time applications that "cry out" for polyFORTH II solutions include scientific and medical instrumentation, data acquisition and analysis, image processing, and manufacturing control.
polyFORTH user-friendly products are available as high-performance native systems for the IBM-PC, -XT, and -AT; the Intel 8086/8088 SBCs; the Motorola 68000; the DEC PDP-11 and LSI-11; the RCA 1802 and 1805; plus the new, ultra-fast NCR/32 and NC 4000.
They also are available as OS-resident systems for MS-DOS, RSX/VMS, CP/M-86, and CP/M-80.
polyFORTH is a complete, fully integrated programming system. It includes all functions normally performed by separate programs: operating system, editor, assembler, compiler, utilities, debugger, and interpreter. And these facilities are co-resident and share common code for common functions.
Call us today ... and tomorrow your robots may be dancing like Baryshnikov.
`
FORTH, Inc.
Hermosa Beach, California 90254
(213) 372-8493 • TELEX (RCA) 275182
`
IBM-PC is a registered trademark of International Business Machines Corporation.
`
`Reader Service Number 2
`
`Moving?
`
`PLEASE NOTIFY
`US 4 WEEKS
`IN ADVANCE
`
`Name (Please Print)
`
`New Address
`
`City
`
`State/Country
`
`Zip
`
`MAIL TO:
`IEEE Service Center
`445 Hoes Lane
`Piscataway, NJ 08854
`
`ATTACH
`LABEL
`HERE
`
`• This notice of address change will apply to all
`IEEE publications to which you subscribe.
`• List new address above.
`• If you have a question about your subscription,
`place label here and clip this form to your letter.
`
`Editor-in-Chief: Bruce D. Shriver
`IBM Thomas J. Watson Research Center
`PO Box 218, Yorktown Heights, NY 10598
`Compmail +: b.shriver
`CS Net: shriver.yktvmv@ibm-sj
`Editorial Board
`Dennis R. Allison, Stanford University
`Barry W. Boehm, TRW-DSG
`Alan M. Davis, BTG, Inc.
`Richard H. Eckhouse, Moco, Inc.
`Jack Grimes, Intel Corp.
`Ted G. Lewis, Oregon State University
`Edith W . Martin, Boeing Aerospace
`William McKeeman
`Wang Institute of Graduate Studies
`Wiley R. McKinzie
`Rochester Institute of Technology
`Edward F. Miller, Software Research Associates
`John B. Munson
`International Software Systems, Inc.
`John D. Musa, Bell Telephone Laboratories
`Robert M. Poston
`Programming Environments, Inc.
`Mary Shaw, Carnegie-Mellon University
`Magazine Advisory Committee
`Dennis W. Fife (chair), Dennis R. Allison,
`Kenneth R. Anderson, James J. Farrell III
`(editor-in-chief, IEEE Micro), Lansing Hatfield
`(editor-in-chief, IEEE Computer Graphics and
`Applications), Ronald G. Hoelzeman, Stephen F.
`Lundstrom, Michael C. Mulder (editor-in-chief,
`Computer), Roy L. Russo (editor-in-chief, IEEE
`Design & Test), Paul Schneck, Bruce D. Shriver
`(editor-in-chief, IEEE Software).
`Editor and Publisher: True Seaborn
`Managing Editor: Marilyn Potes
`Assistant Editors: Carol Ray, Nancy Blackmon,
`Galen Gruman, Angela Reilly
`Contributing Editor: Ware Myers
`Art Director: Jay Simpson
`Circulation Manager: Christina Champion
`Advertising Director: Michael Koehler
`Advertising Coordinator: Sandra J. Arteaga
`
Circulation: IEEE Software (ISSN 0740-7459) is published bimonthly by the IEEE Computer Society: IEEE Headquarters, 345 East 47th Street, New York, NY 10017; IEEE Computer Society West Coast Office, 10662 Los Vaqueros Circle, Los Alamitos, CA 90720; (714) 821-8380. Annual subscription: $14.00 in addition to any IEEE group or society dues; nonmembers, $90.00. Single-copy prices: members $7.50; nonmembers $15.00. This journal is also available in microfiche form.
Undelivered copies: Send to 10662 Los Vaqueros Circle, Los Alamitos, CA 90720.
Postmaster: Send address changes to IEEE Software, IEEE Service Center, 445 Hoes Lane, Piscataway, NJ 08854. Application to mail at second-class postage rates is pending at New York, NY, and at additional mailing offices.
Copyright and reprint permissions: Abstracting is permitted with credit to the source. Libraries are permitted to photocopy beyond the limits of US copyright law for private use of patrons those post-1977 articles that carry a code at the bottom of the first page, provided the per-copy fee indicated in the code is paid through the Copyright Clearance Center, 29 Congress St., Salem, MA 01970. Instructors are permitted to photocopy isolated articles for noncommercial classroom use without fee. For other copying, reprint, or republication permission, write to Editor, IEEE Software, 10662 Los Vaqueros Circle, Los Alamitos, CA 90720. All rights reserved. Copyright © 1985 by the Institute of Electrical and Electronics Engineers, Inc.
Editorial: Unless otherwise stated, bylined articles, as well as descriptions of products and services in New Products, Product Highlights, and other departments, reflect the author's or firm's opinion; inclusion in this publication does not necessarily constitute endorsement by the IEEE or the IEEE Computer Society.
`
`
`
`
`Channel-oriented packet
`casting is a predominant
`feature of Micros, an
`operating system designed to
`explore control and
`communication techniques for
`network computers containing
`thousands of hosts.
`
Multicast
Communication on
Network Computers

Ariel J. Frank, Larry D. Wittie, and Arthur J. Bernstein,
State University of New York at Stony Brook
`
Communication is essential to distributed systems, in which a number of hosts are arbitrarily interconnected by a network. A network computer, or netcomputer,1 consists of a large number of hosts embedded in a network of interconnected broadcast buses. It has no global clock or shared memory. Instead, a global decentralized operating system, with some code resident in every host, unifies the hosts into a single computer system, providing netcomputer users with a powerful computing facility that can be accessed as a single virtual multiprocessor.
The concept of grouping processes to achieve goals is vital to distributed systems. System services for a group help to support communication and coordination among its members. Distributed groups are often organized to achieve parallel processing, increase data availability, reduce response time, share resources, or increase reliability. Processes in a group often need to multicast the same message to all other group processes. Such messages include computation results, search bounds, state changes, votes, and updates to replicated data. Group members may need to multicast to the group several times during an extended interaction period. Such group multicast is more effective if some underlying multicast structure exists for the group. However, not all multicast techniques are effective for group multicast.
This article explores different methods for frequently multicasting the same information to groups of computers within large broadcast networks. Three techniques for group multicast depend either on lists of all group members, lists of broadcast subnets containing all members, or local branch tables for a tree spanning all member subnets. Much of the information presented on group organization and group multicast is based on the implementation of the Micros operating system for the Stony Brook netcomputer.
`
Packet-switched networks

The fundamental unit of information flow in a communication network is a packet. In general, packets are sent by hosts, or nodes, on communication channels, or links. A channel can be a point-to-point link or a multiaccess broadcast bus such as an Ethernet. Here, we consider the general model as a packet-switched network in which hosts can store packets and forward them on their connected channels. (In subsequent discussion of a specific netcomputer model, we refer to broadcast buses as channels.) Packets are transported across a network in datagram mode, that is, with a "best effort to deliver." Providing reliable transport of multicast packets is difficult because each packet must be acknowledged from a possibly unknown number of destinations. Both point-to-point and broadcast networks are packet-switched.

A group of processes implicitly defines a host group consisting of all hosts on which the processes execute.
`
`0740-7459/ 85/ 0500/ 0049$01.00 © 1985 IEEE
`
`49
`
`
`
A host group can be designated explicitly by a list of member addresses or implicitly by a logical group address. With lists, each group member must maintain a list of members so it can multicast to them. A membership list is either dynamic or static, depending on whether it can change. If a logical address is used, all group members must have network interfaces that will accept packets sent to the logical address. Each interface must recognize multiple logical addresses if its host is a member of several groups.
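The two designation styles above can be sketched in a few lines. This is an illustrative model only, not the Micros implementation; all class and field names are assumptions.

```python
class HostGroup:
    """Explicit designation: each member keeps a (dynamic) list of member addresses."""
    def __init__(self, members):
        self.members = set(members)      # dynamic membership list

    def join(self, host):
        self.members.add(host)

    def leave(self, host):
        self.members.discard(host)

class LogicalInterface:
    """Implicit designation: the network interface accepts packets addressed to
    any logical group address it has been told to recognize."""
    def __init__(self, physical_addr):
        self.physical_addr = physical_addr
        self.logical_addrs = set()       # one entry per group the host belongs to

    def accepts(self, dest_addr):
        return dest_addr == self.physical_addr or dest_addr in self.logical_addrs
```

Note that the interface must hold one logical address per group, which is exactly the per-group hardware cost the text describes.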
Evaluating multicast techniques

Multicast techniques can be evaluated relative to bandwidth, delay, state, computation, preparation, maintenance, failure, and scale (see box below right). Tables 1 and 2 rate six types of multicast techniques against these criteria for single and group multicast, respectively. Most of these techniques, which include flooding, separate addressing, multidestination addressing, partite addressing, single-tree forwarding, and multiple-tree forwarding, are adaptations of their broadcast counterparts. All techniques are evaluated for a large multicast group.

The five relative values per criterion are nil, low, medium, high, and gross. Low, medium, and high ratings are always given to some technique. The nil and gross ratings are used only for exceptional cases. A "utopian" multicast technique would have all nil ratings.
`
Flooding. A brute force technique for packet multicast is to broadcast identical packet copies on all channels. Each receiving host forwards copies to all its other channels. Only group hosts keep a copy of the multicast packet. This scheme is very simple. State, preparation, and maintenance costs are negligible, and network failures have almost no impact. However, the very name of the technique reflects its major disadvantage: it floods the network with packets. The delay rating is medium because heavy loads slow channel access. Bandwidth use is extremely high, since packets are duplicated on multiple channels. Bandwidth waste increases with network size, causing a high scale rating. Such gross bandwidth usage is not justifiable for multicast.
`
For flooding to be practical at all, packet lifetime must be limited to prevent endless duplication. Solutions include recording packet sequence numbers to prevent retransmission and discarding packets after a fixed num-
`
Table 1. Relative rating of criteria for single multicast.

Multicast Technique            Bandwidth   Delay    State    Computation
Flooding                       Gross       Medium   Nil      Medium
Separate addressing            High        High     High     Low
Multidestination addressing    Medium      Medium   High     High
Partite addressing             Medium      Medium   Medium   Low
Single-tree forwarding         Low         Medium   Low      Low
Multiple-tree forwarding       Low         Low      Gross    Low
`
Criteria for evaluating multicast techniques

For a single multicast,
• Bandwidth - The communication cost of the packet headers for a single multicast. It is the sum of the number of packets sent over all channels times the average size of their packet headers.
• Delay - The time from the start of the multicast until the last packet copy is delivered. Packet delay per channel is assumed to be uniform. Because packet copies are sent in parallel, delay is a maximum, not a sum. Techniques that minimize delay tend to maximize bandwidth and vice versa.
• State - The summed cost of storing the information that allows members to multicast to the group. It can include logical identifiers, lists of member addresses, or forwarding sublists forming previously built multicast structures. The state information should be bounded.
• Computation - The processing cost for a single multicast. It includes calculation of intermediate destinations and update of multicast state information.

For group multicast,
• Preparation - The initial cost of distributing multicast information to all members. It may include building a structure to lower average cost per multicast.
• Maintenance - The cost of adapting multicast information as members join and leave the group.
• Failure - The cost of recovering from the failure of a network host or channel. Failures may require routing around failed components or repairing multicast structures.
• Scale - The sensitivity of a multicast technique both to larger groups in a fixed-size network and to fixed-size groups in a distributed system of increasing size. Scale should be at most proportional to the increase in size.
`
`
`
`
ber of relays. However, the bandwidth is always high.
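The two lifetime limits just mentioned can be combined in a small sketch. The topology model (a map from hosts to channels and back) and all names are illustrative assumptions, not taken from Micros.

```python
from collections import defaultdict

def flood(source, packet_id, channels_of, hosts_on, max_hops):
    """Flood a packet from `source`, suppressing duplicates by remembered
    sequence number and discarding copies after `max_hops` relays."""
    seen = defaultdict(set)            # host -> packet ids already forwarded
    delivered = set()
    frontier = [(source, max_hops)]
    while frontier:
        host, hops = frontier.pop()
        if packet_id in seen[host] or hops < 0:
            continue                   # duplicate or out of relays: drop
        seen[host].add(packet_id)
        delivered.add(host)
        for ch in channels_of[host]:   # rebroadcast on every attached channel
            for nbr in hosts_on[ch]:
                if nbr != host:
                    frontier.append((nbr, hops - 1))
    return delivered
```

Even with these limits, every channel in the network still carries a copy, which is the gross bandwidth cost the table records.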
Separate addressing. An obvious technique for multicast is to send a separately addressed packet to each destination. Each member maintains a copy of the entire group membership list, which is acquired during group preparation. A large group has a large membership list, so the state and scale ratings are high. However, network failure has little effect.
`
Table 2. Relative rating of criteria for group multicast.

Multicast Technique            Preparation   Maintenance   Failure   Scale
Flooding                       Nil           Nil           Nil       High
Separate addressing            Medium        Medium        Low       High
Multidestination addressing    Medium        Low           Low       High
Partite addressing             Medium        Low           Low       Medium
Single-tree forwarding         High          Medium        Medium    Low
Multiple-tree forwarding       Gross         High          High      Gross
`
`The major disadvantages of this
`technique are its high bandwidth and
`high delay. Several copies differing
`only in their destination addresses are
`often sent on the same channel. The
`multicasting host sends packet copies
`sequentially, so delay can be high. This
`technique is suitable for groups with
`few destination hosts.
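As a sketch, separate addressing reduces to one unicast per member, which is where its linear bandwidth and serial delay come from. The function and parameter names are illustrative.

```python
def separate_addressing_multicast(sender, members, unicast):
    """Send an individually addressed copy to every other group member.
    `unicast` is a stand-in for a point-to-point send primitive."""
    sent = 0
    for dest in members:
        if dest != sender:
            unicast(dest)              # one separately addressed packet per member
            sent += 1
    return sent                        # header cost grows linearly with group size
```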
`Multidestination addressing. In this
`multicasting technique, a few multiply
`addressed packets are sent for each
`multicast. Each packet header includes
`a subset of the destination addresses.
`When a packet arrives at any host, its
`destination addresses are apportioned
`among multiple copies. Destinations
`with the same route share the same
`copy. Packet copies are forwarded to
`all destinations.
Multidestination addressing is hard to support because of the need for variable-sized packet headers. Because of address apportioning, this technique
`
Broadcast/multicast communication

Processes in distributed systems communicate by exchanging messages. Frequently, the same message must be communicated to a set of processes. Such messages include system status changes, announcements, and queries. Existing communication standards, such as the open system interconnection reference model,1 support message communication only to a single destination. Single-destination communication is unicast, the delivery of a packet to one destination address. It can be costly to unicast separate copies of the same message to a set of processes. Advanced distributed systems provide special support for communication to multiple destinations.2,3 Broadcast is the delivery of a packet to all possible destination addresses, while multicast is the delivery of a packet to some specified subset of the possible destinations. Broadcast and unicast are special cases of multicast.

Initial research on techniques for broadcast and multicast centered on point-to-point computer networks and interconnected networks, or internets, especially for Arpanet.4,5 Dalal6 did the fundamental work on broadcast techniques in point-to-point networks. Wall7 extended Dalal's broadcast research and investigated group multicast techniques. Broadcast and multicast structures that have been investigated include "low-cost" minimum spanning trees and "low-delay" shortest path trees.

Most recent research has been done on broadcast networks and internets. Ethernet8 networks support high-speed (10M-bps) packet switching in locally distributed environments. The Ethernet architecture directly supports broadcast and multicast packet transmission by allowing logical addresses to be set in the hardware interface. The few distributed systems supporting group multicast do so on a single Ethernet.9 Ethernets have been interconnected to form the Pup internet architecture,2 providing a rich testbed for broadcasting research. Pup has had a major influence on the internet transport protocols for the Xerox network systems architecture.3
References
1. A. S. Tanenbaum, Computer Networks, Prentice-Hall, Englewood Cliffs, N.J., 1981.
2. D. R. Boggs et al., "Pup: An Internetwork Architecture," IEEE Trans. Communication, Vol. COM-28, No. 4, Apr. 1980, pp. 612-624.
3. Y. K. Dalal, "Use of Multiple Networks in the Xerox Network System," Computer, Vol. 15, No. 10, Oct. 1982, pp. 82-92.
4. S. M. Abraham and Y. K. Dalal, "Techniques for Decentralized Management of Distributed Systems," Proc. Compcon Spring, Feb. 1980, pp. 430-437.
5. L. Aguilar, "Datagram Routing for Internet Multicasting," ACM Sigcomm 84 Computer Communications Review, Vol. 14, No. 2, June 1984, pp. 58-63.
6. Y. K. Dalal and R. M. Metcalfe, "Reverse Path Forwarding of Broadcast Packets," Comm. ACM, Vol. 21, No. 12, Dec. 1978, pp. 1040-1048.
7. D. W. Wall, "Selective Broadcast in Packet-Switched Networks," Proc. Sixth Berkeley Workshop Distributed Data Management and Computer Networks, Feb. 1982, pp. 239-258.
8. J. F. Shoch et al., "Evolution of the Ethernet Local Computer Network," Computer, Vol. 15, No. 8, Aug. 1982, pp. 10-27.
9. D. R. Cheriton and W. Zwaenepoel, "One-to-many Interprocess Communication in the V-system," ACM Sigcomm 84 Computer Communications Review, Vol. 14, No. 2, June 1984, p. 64.
`
`
`
`
`15
`
`16
`
`17
`
`18
`
`5
`
`6
`
`7
`
`25
`
`26
`
`27
`
`28
`
`I
`I
`I
`I
`Channel
`
`8
`1
`A
`I
`I
`I
`Physical
`channel
`identifier
`
`2
`
`I
`I
`I
`I
`I
`I
`I
`I
`
`Front-end
`interface
`
`I
`3 ,'
`I
`I
`Host
`
`35
`
`37
`
`45
`
`46
`
`47
`
`38 ~
`
`48
`
`I
`I
`I
`
`I
`I
`I
`I
`I
`
`4
`
`Socket
`Physical
`host
`identifier
`
`Figure 1. A sample netcomputer.
`
has a high computation rating and a medium delay rating. A minimal number of packets are sent, but bandwidth use is medium because of their large headers. The state, failure, and scale ratings equal those for separate addressing.
`
Partite addressing. This method is a combination of the separate and multidestination addressing techniques. Host destinations are partitioned by some common addressing locality, say, subnets or channels. Separate packets are sent to each partition for final delivery to all local hosts. During group preparation, each member receives a copy of the partition list. This technique is especially useful for broadcast internets (multiple Ethernets), with hosts partitioned by their channels.
`
`52
`
This technique is highly resilient against failures. It has a medium state rating, since all members list only the channel destinations. It uses medium bandwidth, since several copies to different channels may be sent on the same initial channels. The delay is only medium because the multicasting host builds and sends fewer packets than in separate addressing. Adding hosts to the group may not increase the number of channel destinations, so the scale rating is medium. This technique is suitable for a group that resides on a small number of channels, even if there are many hosts.
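Partite addressing reduces to partitioning the member list by channel and sending one packet per partition, letting each channel's own broadcast handle local delivery. The mapping and callback names below are illustrative.

```python
from collections import defaultdict

def partite_multicast(members, channel_of, send_to_channel):
    """Send one packet per channel partition rather than one per host."""
    partitions = defaultdict(list)
    for host in members:
        partitions[channel_of[host]].append(host)
    for channel in partitions:
        send_to_channel(channel)       # local hosts hear it via the broadcast bus
    return len(partitions)             # packets sent = number of member channels
```

This is why adding hosts on already-covered channels costs nothing extra: the packet count tracks channels, not hosts.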
`
Single- and multiple-tree forwarding. In this technique, a spanning tree is built for the hypothetical graph of network hosts connected by channels, and packet copies are forwarded along the branches. Each host member maintains and uses an image of only the local branches, simplifying the computations for forwarding. Three main types of tree structures have been investigated: shortest path, minimum spanning, and centered.
`
A shortest path tree is one with the shortest possible path from the root host to any other tree host. Multiple shortest path trees2 minimize both delay and bandwidth but have a gross state rating, since each member has a separate tree. The preparation rating is gross for multiple trees because they are difficult to build in a distributed manner. Maintenance involves existing trees, so its rating is only high. Tree table space is proportional to the square of group size, so the scale rating is gross.
`
Another technique, called reverse path forwarding,2 simulates multiple trees without actually maintaining them by using routing tables and two additional lists.3 However, some packets may not be delivered if routing tables change during forwarding.
`
A minimum spanning tree2 has the smallest total branch cost of all spanning trees. A centered tree3 is a shortest path tree rooted at a host "in the center" of the group. A single tree minimizes both bandwidth and state but has a medium delay rating because not all paths may be minimal. However, a centered tree has only marginally better ratings than a minimum spanning tree. The preparation in building a single tree is high, but maintenance is medium and scale is low. Although sensitive to failures, trees form an excellent structure for group multicast. In general, trees have the lowest bandwidth and delay ratings. They are particularly suitable for point-to-point networks.
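The per-member forwarding rule that makes trees cheap, keep only your local branches and forward on every branch except the one the packet arrived on, can be sketched as follows. The branch-table contents are illustrative.

```python
def tree_forward(node, came_from, branches, deliver):
    """Forward a multicast packet along a spanning tree. `branches` maps
    each node to its local branch list, which is all the state a member
    needs to hold."""
    deliver(node)
    for nbr in branches[node]:
        if nbr != came_from:                      # never send back up the tree
            tree_forward(nbr, node, branches, deliver)
```

Each packet crosses every tree branch exactly once, which is the low bandwidth rating; the price is that a failed branch cuts off a whole subtree until the tree is repaired.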
`
The major problem with host tree forwarding on broadcast internets is that separate packet copies are sent to each host, not to each channel. In addition, a spanning tree specifies an intermediate host between member hosts. The technique of using alternate paths
`
`
`
`
between members would be less failure-sensitive.

A good solution to both problems is to span a logical tree over channels and not over hosts. Packets can be forwarded to channels and then to hosts, as in partite addressing. The tree spans only member channels; physical paths between them are decided by runtime routing. (The use of channel-based trees is further described later.)
`
The netcomputer framework

A netcomputer is physically distributed like a local broadcast internet but provides its users a single virtual computer like a multiprocessor. It has three types of communication resources: channels, hosts, and sockets (Figure 1). A multiaccess broadcast bus with short transmission delays is referred to as a channel; a host is a processing node; and a socket within a host is a process address. Each resource type has physical and logical addresses. A physical identifier refers to a specific instance of a communication resource; a logical identifier refers to a group of them.

A broadcast bus supports the logical addressing of multiple hosts. Netcomputer addressing conventions, based on those for internets in Xerox network systems,4 extend logical host addressing over the entire netcomputer and provide for logical channel addressing. Each netcomputer address consists of channel, host, and socket identifiers. Since variable-sized packet headers can cause severe buffering problems, each packet header is assumed to have only one destination address.
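The address structure just described can be modeled as a fixed triple. The field widths follow the article (32-bit channel, 48-bit host, 32-bit socket identifiers); the flat packing function is an illustrative assumption, not the wire format.

```python
from typing import NamedTuple

class NetAddress(NamedTuple):
    channel: int    # 32-bit physical or logical channel identifier
    host: int       # 48-bit physical or logical host identifier
    socket: int     # 32-bit physical or logical socket identifier

def pack(addr):
    """Illustrative flat encoding of the triple into one 112-bit integer,
    so every packet header carries exactly one fixed-size destination."""
    return (addr.channel << 80) | (addr.host << 32) | addr.socket
```

A fixed-size destination field is what sidesteps the variable-header buffering problem noted above.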
`
Channels. A netcomputer can contain a large number of channels, such as Ethernets.5 Each channel has a unique, 32-bit physical channel identifier in an unambiguous flat addressing scheme. Physical channel identifiers are used mainly as designators for netcomputer routing. The physical channel identifiers in Figure 1 are arbitrary. A channel interfaced to a host via a transmission front end is said to be directly connected to that host; otherwise it is distant. In Figure 1, channel 8
`
`
is directly connected to host 18, and channel 3 is distant. A channel can be directly connected to many hosts; a host can have several directly connected channels. Figure 1 depicts a grid netcomputer in which each host has at most two directly connected channels, nominally vertical and horizontal.
`
Hosts. A netcomputer can contain a large number of hosts. Each host has a unique, 48-bit physical host identifier. Netcomputer physical host identifiers are absolute and implement a flat addressing scheme that is independent of the channel addressing scheme and can be used in the generation of other unique identifiers. Each host gives its physical identifier to all its attached front ends as its host address. In Figure 1, each physical host identifier is conveniently chosen as the concatenation of its two physical channel identifiers.

A netcomputer is physically distributed like a broadcast internet, but has a single virtual computer like a multiprocessor.
`
Sockets. A socket represents a bidirectional port within a host that serves as a source and destination for packets. Packets can be both delivered to and transmitted from a socket. A host can support a large number of sockets. Each host can receive packets addressed to its directly connected channels.

Each socket has a 32-bit physical socket identifier that is globally unique but ephemeral. It is generated by incorporating the unique part of a physical host identifier. For a 48-bit Ethernet address, this part is 20 bits long. A logical socket identifier, which designates a group of one or more sockets in one or more hosts, is needed for all group members to receive a multicast packet on the same socket address.
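One way the scheme above could fit together is sketched below: the 20-bit unique part of the host identifier plus a 12-bit local counter fills a 32-bit socket identifier. The exact field split is an assumption; the article states only that the 20-bit unique part is incorporated.

```python
import itertools

_local_counter = itertools.count()     # ephemeral per-host generator

def make_socket_id(host_id_48):
    """Build a 32-bit physical socket id from the 20-bit unique part of a
    48-bit Ethernet-style host id plus a 12-bit local counter (assumed split)."""
    unique20 = host_id_48 & 0xFFFFF        # unique part of the host identifier
    local12 = next(_local_counter) & 0xFFF # local, ephemeral sequence number
    return (unique20 << 12) | local12
```

Because the host part is globally unique, no two hosts can mint the same socket identifier; because the counter wraps, identifiers are ephemeral rather than permanent.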
`
Packet casting

Since a netcomputer can be configured with an arbitrarily large number of channels and hosts, it can be quite large and the packet transport mechanisms quite complex. Packets are transported by a network level functionally comparable to the OSI network layer,6 but without the strict interface boundaries. Packets are transported across a netcomputer as datagrams.

Packet transport on a netcomputer is oriented toward channels. Packets are transmitted on, switched among, and routed to channels. Routing to channels and not to hosts provides an order of magnitude reduction in the routing information required on each host.4 Another order of magnitude reduction can be achieved by partitioning routing information among all hosts on each physical channel.
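Channel-oriented routing can be sketched as a table keyed by destination channel rather than destination host, which is the first order-of-magnitude saving noted above. The table contents and function names are illustrative.

```python
def route_packet(my_channels, routing_table, dest_channel):
    """Pick the outgoing channel for a packet bound for `dest_channel`.
    `routing_table` holds one next-hop entry per destination *channel*,
    not per destination host."""
    if dest_channel in my_channels:
        return dest_channel             # destination channel is directly connected
    return routing_table[dest_channel]  # otherwise forward toward it
```

A host on two channels in a grid of thousands of hosts thus needs entries only for the channels, and partitioning those entries among the hosts sharing each channel shrinks the per-host table again.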
A single channel provides an excellent architecture for multidestination communication but has limited extensibility. Providing multicast communication on a netcomputer is harder, since packet casting on distant channels requires support. Requirements for packet casting on a netcomputer may be as simple as packet transmission to a single destination host on a