Programmability with increased performance? New strategies to
attain this goal include two approaches to data flow architecture:
data flow multiprocessors and the cell block architecture.

Data Flow Supercomputers

Jack B. Dennis
MIT Laboratory for Computer Science

The architects of supercomputers must meet three challenges if the next
generation of machines is to find productive large-scale application to
the important problems of computational physics. First, they must
achieve high performance at acceptable cost. Instruction execution
rates of a billion floating-point operations each second are in demand,
whereas current architectures require intricate programming to attain a
fraction of their potential, at best around one tenth of the goal.
Brute-force approaches to increasing the speed of conventional
architectures have reached their limit and fail to take advantage of
the major recent advances in semiconductor device technology. Second,
they must exploit the potential of LSI technology. Novel architectures
are needed which use large numbers but only a few different types of
parts, each with a high logic-to-pin ratio. In a supercomputer, most of
these parts must be productive most of the time; hence the need to
exploit concurrency of computation on a massive scale. Third, it must
be possible to program supercomputers to exploit their performance
potential. This has proven to be an enormous problem, even in the case
of computations for which reasonably straightforward Fortran programs
exist. Thus present supercomputer architectures have exacerbated rather
than resolved the software crisis.
It appears that the objectives of improving programmability and
increasing performance are in conflict, and new approaches are
necessary. However, any major departure from conventional architectures
based on sequential program execution requires that the whole process
of program design, structure, and compilation be redone along new
lines. One architecture under consideration is a multiprocessor machine
made of hundreds of intercommunicating microcomputer processing
elements. This architecture has attracted wide interest, but has many
drawbacks: even if the processing elements had full floating-point
capability and ran at a million instructions per second, at least one
thousand would be required to attain a billion instructions per second.
For such a number of processing elements there is no known way of
permitting access to a shared memory without severe performance
degradation. Similarly, no known way of arranging conventional
microprocessors for synchronization or message passing allows efficient
operation while exploiting fine-grain parallelism in an application.
And finally, there is no programming language or methodology that
supports mapping application codes onto such a multiprocessor in a way
that achieves high performance.
Language-based computer design can ensure the programmability of a
radical architecture. In a language-based design the computer is a
hardware interpreter for a specific base language, and programs to be
run on the system must be expressed in this language.1 Because future
supercomputers must support massive concurrency to achieve a
significant increase in performance, a base language for supercomputers
must allow expression of concurrency of program execution on a large
scale. Since conventional languages such as Fortran are based on a
global state model of computer operation, these languages are
unsuitable for the next generation of supercomputers and will
eventually be abandoned for large-scale scientific computation. At
present, functional or applicative programming languages and data flow
models of computation are the only known foundation appropriate for a
supercomputer base language. Two programming languages have been
designed recently in response to the need for an applicative
programming language suitable for scientific numerical computation: Id,
developed at Irvine,2 and Val, designed at MIT.3,4

Data flow architectures offer a possible solution to the problem of
efficiently exploiting concurrency of computation on a large scale, and
they are compatible with modern concepts of program structure.
Therefore, they should not suffer so much from the difficulties of
programming that have hampered other approaches to highly parallel
computation.

0018-9162/80/1100-0048$00.75 © 1980 IEEE    COMPUTER

The data flow concept is a fundamentally different way of looking at
instruction execution in machine-level programs: an alternative to
sequential instruction execution. In a data flow computer, an
instruction is ready for execution when its operands have arrived.
There is no concept of control flow, and data flow computers do not
have program location counters. A consequence of data-activated
instruction execution is that many instructions of a data flow program
may be available for execution at once. Thus, highly concurrent
computation is a natural consequence of the data flow concept.
The idea of data-driven computation is old,5,6 but only in recent years
have architectural schemes been developed with attractive anticipated
performance and the capability of supporting a general level of user
language. Work on data-driven concepts of program structure and on the
design of practical data-driven computers is now in progress in at
least a dozen laboratories in the US and Europe. Several processors
with data-driven instruction execution have been built, and more
hardware projects are being planned. Most of this work on architectural
concepts for data flow computation is based on a program representation
known as data flow program graphs,7 which evolved from the work of
Rodriguez,8 Adams,9 and Karp and Miller.10 In fact, data flow computers
are a form of language-based architecture in which program graphs are
the base language. As shown in Figure 1, data flow program graphs serve
as a formally specified interface between system architecture on one
hand and user programming language on the other. The architect's task
is to define and realize a computer system that faithfully implements
the formal behavior of program graphs; the language implementer's task
is to translate source language programs into their equivalent as
program graphs.

The techniques used to translate source language programs into data
flow graphs11 are similar to the methods used in conventional
optimizing compilers to analyze the paths of data dependency in source
programs. High-level programming languages for data flow computation
should be designed so that it is easy for the translator to identify
data dependence and generate program graphs that expose parallelism.
The primary sources of difficulty are unrestricted transfer of control
and the "side effects" resulting from assignment to a global variable
or to the input arguments of a procedure. Removal of these sources of
difficulty not only makes concurrency easy to identify, it also
improves program structure. Programs are more modular, and are easier
to understand and verify. The implications of data flow for language
designers are discussed by Ackerman.12
This article presents two architectures from the variety of schemes
devised to support computations expressed as data flow graphs. First we
explain data flow graphs by examples, and show how they are represented
as collections of activity templates. Next we describe the basic
instruction-handling mechanism used in most current projects to build
prototype data flow systems. Then we develop the two contrasting
architectures and discuss the reasons for their differences, in
particular the different approaches to communicating information
between parts of a data flow machine.

November 1980

Data flow programs

A data flow program graph is made up of actors connected by arcs. One
kind of actor is the operator shown in Figure 2, drawn as a circle with
a function symbol written inside, in this case +, indicating addition.
An operator also has input arcs and output arcs which carry tokens
bearing values. The arcs define paths over which values from one actor
are conveyed by tokens to other actors. Tokens are placed on and
removed from the arcs of a program graph according to firing rules,
which are illustrated for an operator in Figure 3. For an actor to be
enabled, tokens must be present on each of its input arcs, and there
must be no token on any of its output arcs. Any enabled actor may be
fired. In the case of an operator, firing means removing one token from
each input arc, applying the specified function to the values carried
by those tokens, and placing tokens labeled with the result value on
the output arcs.
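The firing rule just described can be sketched in a few lines of Python
(a modern rendering, not part of the original article; the
dictionary-based encoding of actors and arcs is our own assumption):

```python
def enabled(actor, tokens):
    """An actor is enabled when every input arc holds a token and
    every output arc is empty; tokens maps arc name -> value or None."""
    return (all(tokens[a] is not None for a in actor["inputs"]) and
            all(tokens[a] is None for a in actor["outputs"]))

def fire(actor, tokens):
    """Remove one token from each input arc, apply the function, and
    place a token carrying the result on each output arc."""
    args = [tokens[a] for a in actor["inputs"]]
    for a in actor["inputs"]:
        tokens[a] = None
    result = actor["fn"](*args)
    for a in actor["outputs"]:
        tokens[a] = result

# The plus operator of Figure 2, with input arcs x, y and output arc z.
plus = {"fn": lambda a, b: a + b, "inputs": ["x", "y"], "outputs": ["z"]}
tokens = {"x": 3, "y": 5, "z": None}
assert enabled(plus, tokens)
fire(plus, tokens)
assert tokens == {"x": None, "y": None, "z": 8}
```

Note that the rule is symmetric: once the output arc holds a token, the
actor is no longer enabled until that token is consumed downstream.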

Figure 1. Program graphs as a base language.

Figure 2. Data flow actor.

Figure 3. Firing rule: (a) before; (b) after.
Figure 4. Interconnection of operators.

Figure 5. An activity template.

Operators may be connected as shown in Figure 4 to form program graphs.
Here, presenting tokens bearing values for x and y at the two inputs
will enable computation of the value

z = (x+y) * (x-y)

by the program graph, placing a token carrying the result value on
output arc z.
Another representation for data flow programs, one much closer to the
machine language used in prototype data flow computers, is useful in
understanding the working of these machines. In this scheme, a data
flow program is a collection of activity templates, each corresponding
to one or more actors of a data flow program graph. An activity
template corresponding to the plus operator (Figure 2) is shown in
Figure 5. There are four fields: an operation code specifying the
operation to be performed; two receivers, which are places waiting to
be filled in with operand values; and destination fields (in this case
one), which specify what is to be done with the result of the operation
on the operands.

An instruction of a data flow program is the fixed portion of an
activity template. It consists of the operation code and the
destinations; that is,

instruction:
<opcode, destinations>

Figure 6 shows how activity templates are joined to represent a program
graph, specifically the composition of operators in Figure 4. Each
destination field specifies a target receiver by giving the address of
some activity template and an input integer specifying which receiver
of the template is the target; that is,

destination:
<address, input>
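Under this scheme, the templates for the graph of Figure 4 might be
encoded as follows (a sketch in Python; the field names and the
dictionary serving as an activity store are our own assumptions, not
the article's machine format):

```python
from dataclasses import dataclass

@dataclass
class Template:
    opcode: str         # operation to be performed
    receivers: list     # operand slots, None until filled
    destinations: list  # (address, input) pairs for the result

# Templates for z = (x + y) * (x - y): the results of + and - feed
# the two receivers of the * template held at address 2.
store = {
    0: Template("add", [None, None], [(2, 0)]),
    1: Template("sub", [None, None], [(2, 1)]),
    2: Template("mul", [None, None], []),
}

def deliver(value, address, inp):
    """Place a result value in the receiver named by a destination."""
    store[address].receivers[inp] = value
```

The fixed <opcode, destinations> part is the instruction; only the
receivers change as the program runs.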

Figure 6. Configuration of activity templates for the program graph of
Figure 4.

Program structures for conditionals and iteration are illustrated in
Figures 7 and 8. These use two new data flow actors, switch and merge,
which control the routing of data values. The switch actor sends a data
input to its T or F output to match a true or false boolean control
input. The merge actor forwards a data value from its T or F input
according to its boolean input value. The conditional program graph and
implementation in Figure 7 represent computation of

y := (IF x > 3 THEN x + 2 ELSE x - 1) * 4

and the program graph and implementation in Figure 8 represent the
iterative computation

WHILE x > 0 DO x := x - 3
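The behavior of the two actors, and the iterative schema above, can be
sketched as follows (Python; the dictionary encoding of the switch
outputs and the sequential driver loop are our own, standing in for the
token routing of Figure 8):

```python
def switch(data, control):
    """Route the data token to the T or F output per the control token."""
    return {"T": data} if control else {"F": data}

def merge(control, t_input=None, f_input=None):
    """Forward the token from the T or F input selected by control."""
    return t_input if control else f_input

# WHILE x > 0 DO x := x - 3, driven sequentially: the predicate actor
# produces the control token, the switch routes the value, and the
# F output carries the final value out of the loop.
x = 7
while True:
    out = switch(x, x > 0)
    if "F" in out:          # false side: the value leaves the loop
        x = out["F"]
        break
    x = out["T"] - 3        # true side: loop body x := x - 3
# x is now -2 (7 -> 4 -> 1 -> -2)
```

In the actual graph the merge actor feeds each iteration's result back
to the predicate and the switch; the loop above collapses that cycle
into ordinary control flow.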

Execution of a machine program consisting of activity templates is
viewed as follows. The contents of a template activated by the presence
of an operand value in each receiver take the form

operation packet:
<opcode, operands, destinations>

Such a packet specifies one result packet of the form

result packet:
<value, destination>

for each destination field of the template. Generation of a result
packet, in turn, causes the value to be placed in the receiver
designated by its destination field.

Note that this view of data flow computation does not explicitly honor
the rule of program graphs that tokens must be absent from the output
arcs of an actor for it to fire. Yet there are situations where it is
attractive to use a program graph in pipelined fashion, as illustrated
in Figure 9a. Here, one computation by the graph has produced the value
6 on arc z while a new computation represented by input values 5 and 3
on arcs x and y is ready to begin. To faithfully implement this
computation, the add instruction must not be reactivated until its
previous result has been used by the multiply instruction. This
constraint is enforced through use of acknowledge signals generated by
specially marked destinations (*) in an activity template. Acknowledge
signals, in general, are sent to the templates that supply operand
values to the activity template in question (Figure 9b). The enabling
rule now requires that all receivers contain values, and that the
required number of acknowledge signals have been received. This number
(if nonzero) is written adjacent to the opcode of an activity template.

Figure 7. A conditional schema (a) and its implementation (b).

Figure 8. An iterative schema (a) and its implementation (b).

Figure 9. Pipelining in a data flow program (a) and its
implementation (b).
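The extended enabling rule can be stated as a simple predicate (a
sketch; the field names are our own assumptions):

```python
def enabled(template, acks_received):
    """An instruction is enabled when every receiver holds a value and
    the required number of acknowledge signals has arrived."""
    return (all(r is not None for r in template["receivers"]) and
            acks_received >= template["acks_needed"])

# The add instruction of Figure 9b needs one acknowledge from the
# multiply instruction before it may fire again.
add = {"receivers": [5, 3], "acks_needed": 1}
assert not enabled(add, 0)   # prior result not yet consumed downstream
assert enabled(add, 1)       # multiply has acknowledged; safe to refire
```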

The basic mechanism

The basic instruction execution mechanism used in several current data
flow projects is illustrated in Figure 10. The data flow program
describing the computation to be performed is held as a collection of
activity templates in the activity store. Each activity template has a
unique address which is entered in the instruction queue unit (a FIFO
buffer store) when the instruction is ready for execution. The fetch
unit takes an instruction address from the instruction queue, reads the
activity template from the activity store, forms it into an operation
packet, and passes it on to the operation unit. The operation unit
performs the operation specified by the operation code on the operand
values, generating one result packet for each destination field of the
operation packet. The update unit receives result packets and enters
the values they carry into operand fields of activity templates as
specified by their destination fields. The update unit also tests
whether all operand and acknowledge packets required to activate the
destination instruction have been received and, if so, enters the
instruction address in the instruction queue.

During program execution, the number of entries in the instruction
queue measures the degree of concurrency present in the program. The
basic mechanism of Figure 10 can exploit this potential to a limited
but significant degree: once the fetch unit has sent an operation
packet off to the operation unit, it may immediately read another entry
from the instruction queue without waiting for the instruction
previously fetched to be completely processed. Thus a continuous stream
of operation packets may flow from the fetch unit to the operation unit
so long as the instruction queue is not empty.

Figure 10. Basic instruction execution mechanism.
This mechanism is aptly called a circular pipeline: activity controlled
by the flow of information packets traverses the ring of units
leftwise. A number of packets may be flowing simultaneously in
different parts of the ring on behalf of different instructions in
concurrent execution. Thus the ring operates as a pipeline system with
all of its units actively processing packets at once. The degree of
concurrency possible is limited by the number of units on the ring and
the degree of pipelining within each unit. Additional concurrency may
be exploited by splitting any unit in the ring into several units which
can be allocated to concurrent activities. Ultimately, the level of
concurrency is limited by the capacity of the data paths connecting the
units of the ring. This basic mechanism is essentially that implemented
in a prototype data flow processing element built by a group at the
Texas Instruments Company.13 The same mechanism, elaborated to handle
data flow procedures, was described earlier by Rumbaugh,14 and a new
project at Manchester University uses another variation of the same
scheme.15
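The fetch/operate/update cycle can be sketched as a small interpreter
(a Python rendering under our own encoding of templates; the article's
units are hardware stages, and acknowledge handling is omitted here for
brevity):

```python
from collections import deque

OPS = {"add": lambda a, b: a + b, "sub": lambda a, b: a - b,
       "mul": lambda a, b: a * b}

# Activity store for z = (x + y) * (x - y); destinations are
# (address, input) pairs as in the text.
store = {
    0: {"opcode": "add", "receivers": [None, None], "dest": [(2, 0)]},
    1: {"opcode": "sub", "receivers": [None, None], "dest": [(2, 1)]},
    2: {"opcode": "mul", "receivers": [None, None], "dest": []},
}
queue = deque()   # instruction queue: FIFO of enabled-instruction addresses
results = []      # result packets with no destination leave the program

def update(value, address, inp):
    """Update unit: fill a receiver; enqueue the instruction if enabled."""
    t = store[address]
    t["receivers"][inp] = value
    if all(r is not None for r in t["receivers"]):
        queue.append(address)

# Present input tokens x = 5, y = 3 to both the add and sub templates.
for addr in (0, 1):
    update(5, addr, 0)
    update(3, addr, 1)

while queue:                     # fetch -> operate -> update ring
    t = store[queue.popleft()]   # fetch: form the operation packet
    value = OPS[t["opcode"]](*t["receivers"])
    t["receivers"] = [None] * len(t["receivers"])
    for address, inp in t["dest"]:
        update(value, address, inp)
    if not t["dest"]:
        results.append(value)

print(results)                   # [16]  since (5+3)*(5-3) = 16
```

In hardware the three stages run concurrently on different packets;
this sequential loop exhibits only the data movement.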

The data flow multiprocessor

The level of concurrency exploited may be increased enormously by
connecting many processing elements of the form we have described to
form a data flow multiprocessor system. Figure 11a shows many
processing elements connected through a communication system, and
Figure 11b shows how each processing element relates to the
communication system. The data flow program is divided into parts which
are distributed over the processing elements. The activity stores of
the processing elements collectively realize a single large address
space, so the address field of a destination may select uniquely any
activity template in the system. Each processing element sends a result
packet through the communication network if its destination address
specifies a nonlocal activity template, and to its own update unit
otherwise.

The communication network is responsible for delivering each result
packet received to the processing element that holds the target
activity template. This network, called a routing network, transmits
each packet arriving at an input port to the output specified by
information contained in the packet. The requirements of a routing
network for a data flow multiprocessor differ in two important ways
from those of a processor/memory switch for a conventional
multiprocessor system. First, information flow in a routing network is
in one direction: an immediate reply from the target unit to the
originating unit is not required. Second, since each processing element
holds many enabled instructions ready for processing, some delay can be
tolerated in transmission of result packets without slowing down the
overall rate of computation.
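The local-versus-nonlocal dispatch decision of a processing element can
be sketched as follows (the contiguous block-per-PE partitioning of the
activity-store address space is our own assumption; the article only
requires that a destination address uniquely select any template in the
system):

```python
def dispatch(address, my_pe, templates_per_pe=1024):
    """Return where a result packet goes: the PE's own update unit if
    the target template is local, else the routing network."""
    target_pe = address // templates_per_pe
    return "update unit" if target_pe == my_pe else "routing network"

# Template 2048 lives on PE 2 under this partitioning.
assert dispatch(2048, my_pe=2) == "update unit"
assert dispatch(2048, my_pe=0) == "routing network"
```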
The crossbar switch in conventional multiprocessor systems meets the
requirements for immediate response and small delay by providing signal
paths from any input to any output. These paths are established on
request and maintained until a reply completes a processor/memory
transaction. This arrangement is needlessly expensive for a data flow
multiprocessor, and a number of alternative network structures have
been proposed. The ring form of communication network is used in many
computer networks, and has been used by Texas Instruments to couple
four processing elements in their prototype data flow computer. The
drawback of the ring is that delay grows linearly with size, and there
is a fixed bound on capacity. Several groups have proposed
tree-structured networks for communication among processing
elements.16,17,18 Here, the drawback is that traffic density at the
root node may be unacceptably high. Advantages of the tree are that the
worst-case distance between leaves grows only as log2 N (for a binary
tree), and many pairs of nodes are connected by short paths.
The packet routing network shown in Figure 12 is a structure currently
attracting much attention. A routing network with N input and N output
ports may be assembled from (N/2) log2 N units, each of which is a
2 x 2 router. A 2 x 2 router receives packets at two input ports and
transmits each received packet at one of its output ports according to
an address bit contained in the packet. Packets are handled first come,
first served, and both output ports may be active concurrently. Delay
through an N x N network increases as log2 N, and capacity rises nearly
linearly with N. This form of routing network is described in Leung19
and Tripathi and Lipovski.20 Several related structures have been
analyzed for capacity and delay.21
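The stage-by-stage address decoding can be sketched as follows (our own
rendering, not the article's notation: at each of the log2 N stages a
2 x 2 router inspects one bit of the packet's destination address and
forwards the packet on its upper or lower output port):

```python
def route(dest, n_ports=8):
    """Return the output port (0 = upper, 1 = lower) taken at each
    router stage by a packet addressed to `dest` in an N x N network."""
    stages = n_ports.bit_length() - 1          # log2(n_ports) stages
    return [(dest >> k) & 1 for k in reversed(range(stages))]

# Destination 5 = binary 101: lower port, upper port, lower port.
assert route(5) == [1, 0, 1]

# Router count for an 8 x 8 network: (N/2) * log2(N) = 4 * 3 = 12 units.
assert (8 // 2) * 3 == 12
```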

The cell block architecture

In a data flow multiprocessor (Figure 11), we noted the problem of
partitioning the instructions of a program among the processing
elements so as to concentrate communication among instructions held in
the same processing element. This is advantageous because the time to
transport a result packet to a nonlocal processor through the routing
network will be longer (perhaps much longer) than the time to forward a
result locally.

At MIT, an architecture has been proposed in response to an opposing
view: each instruction is equally accessible to result packets
generated by any other instruction, regardless of where they reside in
the machine.22,23 The structure of this machine is shown in Figure 13.
The heart of this architecture is a large set of instruction cells,
each of which holds one activity template of a data flow program.
Result packets arrive at instruction cells from the distribution
network. Each instruction cell sends an operation packet to the
arbitration network when all operands and signals have been received.
The function of the operation section is to execute instructions and to
forward result packets to target instructions by way of the
distribution network.

Figure 11. Data flow multiprocessor: (a) connection of many processing
elements through a communication system; (b) relationship of each PE to
the communication system.

Figure 12. Routing network structure.

Figure 13. Genesis of the cell block architecture.

Figure 14. Practical form of the cell block architecture.

Figure 15. Cell block implementation.

The design in Figure 13 is impractical if the instruction cells are
fabricated as individual physical units, since the number of devices
and interconnections would be enormous. A more attractive structure is
obtained if the instruction cells are grouped into blocks and each
block realized as a single device. Such an instruction cell block has a
single input port for result packets and a single output port for
operation packets. Thus one cell block unit replaces many instruction
cells and the associated portion of the distribution network. Moreover,
a byte-serial format for result and operation packets further reduces
the number of interconnections between cell blocks and other units.
The resulting structure is shown in Figure 14. Here, several cell
blocks are served by a shared group of functional units P1, ..., Pk.
The arbitration network in each section of the machine passes each
operation packet to the appropriate functional unit according to its
opcode. The number of functional unit types in such a machine is likely
to be small (four, for example), or just one universal functional unit
type might be provided, in which case the arbitration network becomes
trivial.
The relationship between the cell block architecture and the basic
mechanism described earlier becomes clear when one considers how a cell
block unit would be constructed. As shown in Figure 15, a cell block
would include storage for activity templates, a buffer store for
addresses of enabled instructions, and control units to receive result
packets and transmit operation packets. These control units are
functionally equivalent to the fetch and update units of the basic
mechanism. The cell block differs from the basic data flow processing
element in that the cell block contains no functional units, and there
is no shortcut for result packets destined for successor instructions
held in the same cell block.

Discussion and conclusions

In the cell block architecture, communication of a result packet from
one instruction to its successor is equally easy (or equally difficult,
depending on your point of view) regardless of how the two instructions
are placed within the entire activity store of the machine. Thus the
programmer need not be concerned that his program might run slowly due
to an unfortunate distribution of instructions in the activity store
address space. In fact, a random allocation of instructions may prove
to be adequate.

In the data flow multiprocessor, communication between two instructions
is much quicker if these instructions are allocated to the same
processing element. Thus a program may run much faster if its
instructions are clustered to minimize communication traffic between
clusters and each cluster is allocated to one processing element. Since
it will be handling significantly less packet traffic, the
communication network of the data flow multiprocessor will be simpler
and less expensive than the distribution network in the cell block
architecture. Whether the cost reduction justifies the additional
programming effort is a matter of debate, contingent on the area of
application, the technology of fabrication, and the time frame under
consideration.
Although the routing networks in the two forms of data flow processor
have a much more favorable growth of logic complexity (N log N) with
increasing size than the switching networks of conventional
multiprocessor systems, their growth is still more than linear.
Moreover, in all suggested physical structures for N x N routing
networks, the complexity as measured by total wire length grows as
O(N²). This fact shows that interconnection complexity still places
limits on the size of practical multiunit systems which support
universal intercommunication. If we need still larger systems, it
appears we must settle for arrangements of units that only support
communication with immediate neighbors.
The advantage data flow architectures have over other approaches to
high-performance computation is that the scheduling and synchronization
of concurrent activities are built in at the hardware level, enabling
each instruction execution to be treated as an independent concurrent
action. This allows efficient fine-grain parallelism, which is
precluded when the synchronization and scheduling functions are
realized in software or microcode. Furthermore, there are well-defined
rules for translating high-level programs into data flow machine code.
What are the prospects for data flow supercomputers? Machines based on
either of the two architectures presented in this paper could be built
today. A machine having up to 512 processing elements or cell blocks
seems feasible. For example, a 4 x 4 router for packets, each sent as a
series of 8-bit bytes, could be fabricated as a 100-pin LSI device, and
fewer than one thousand of these devices could interconnect 512
processing elements or cell blocks. If each processing unit could
operate at two million instructions per second, the goal of a billion
instructions per second would be achieved.
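The arithmetic behind these figures can be checked directly (the
five-stage layout of 4 x 4 routers is our own reconstruction of the
"fewer than one thousand devices" claim, not a count given in the
article):

```python
import math

n = 512                                    # processing elements or cell blocks
stages = math.ceil(math.log(n, 4))         # a 4 x 4 router resolves 2 address bits
routers = (n // 4) * stages                # n/4 routers in each stage
assert routers == 640 and routers < 1000   # fewer than one thousand devices

assert n * 2_000_000 >= 1_000_000_000      # 512 units at 2 MIPS reach a billion
```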
Yet there are problems to be solved and issues to be addressed. It is
difficult to see how data flow computers could support programs written
in Fortran without restrictions on and careful tailoring of the code.
Study is just beginning on applicative languages like Val and Id.24,25
These promise solutions to the problems of map-
ping high-level programs into machine-level programs that effectively
utilize machine resources, but much remains to be done. Creative
research is needed to handle data structures in a manner consistent
with principles of data flow computation. These are among the problems
under study in our data flow project at MIT.

Acknowledgment

This paper is based on research supported by the Lawrence Livermore
National Laboratory of the University of California under contract
8545403.
`
`References
`
1. J.B. Dennis, "On the Design and Specification of a Common Base Language," Proc. Symp. Computers and Automata, Polytechnic Press, Polytechnic Institute of Brooklyn, Apr. 1971, pp. 47-74.
2. Arvind, K.P. Gostelow, and W. Plouffe, An Asynchronous Programming Language and Computing Machine, Dept. of Information and Computer Science, University of California, Irvine, Technical Report 114a, Dec. 1978, 97 pp.
3. W.B. Ackerman and J.B. Dennis, VAL: A Value-Oriented Algorithmic Language, Preliminary Reference Manual, Laboratory for Computer Science, MIT, Technical Report TR-218, June 1979, 80 pp.
4. J.R. McGraw, Data Flow Computing: The VAL Language, submitted for publication.
5. R.R. Seeber and A.B. Lindquist, "Associative Logic for Highly Parallel Systems," AFIPS Conf. Proc., 1963, pp. 489-493.
6. R.M. Shapiro, H. Saint, and D.L. Presberg, Representation of Algorithms as Cyclic Partial Orderings, Applied Data Research, Wakefield, Mass., Report CA-7112-2711, Dec. 1971.
7. J.B. Dennis, "First Version of a Data Flow Procedure Language," Lecture Notes in Computer Science, Vol. 19, Springer-Verlag, 1974, pp. 362-376.
8. J.E. Rodriguez, A Graph Model for Parallel Computation, Laboratory for Computer Science, MIT, Technical Report TR-64, Sept. 1969, 120 pp.
9. D.A. Adams, A Computation Model With Data Flow Sequencing, Computer Science Dept., School of Humanities and Sciences, Stanford University, Technical Report CS 117, Dec. 1968, 130 pp.
10. R.M. Karp and R.E. Miller, "Properties of a Model for Parallel Computations: Determinacy, Termination, Queueing," SIAM J. Applied Math., Vol. 14, Nov. 1966, pp. 1390-1411.
11. J.D. Brock and L.B. Montz, "Translation and Optimization of Data Flow Programs," Proc. 1979 Int'l Conf. on Parallel Processing, Bellaire, Mich., Aug. 1979, pp. 46-54.
12. W.B. Ackerman, "Data Flow Languages," AFIPS Conf. Proc., Vol. 48, 1979 NCC, New York, June 1979, pp. 1087-1095.
13. M. Cornish, private communication, Texas Instruments Corp., Austin, Tex.
14. J.E. Rumbaugh, "A Data Flow Multiprocessor," IEEE Trans. Computers, Vol. C-26, No. 2, Feb. 1977, pp. 138-146.
15. I. Watson and J. Gurd, "A Prototype Data Flow Computer With Token Labelling," AFIPS Conf. Proc., 1979 NCC, New York, June 1979, pp. 623-628.
16. A. Davis, "A Data Flow Evaluation System Based on the Concept of Recursive Locality," AFIPS Conf. Proc., Vol. 48, 1979 NCC, New York, June 1979, pp. 1079-1086.
17. A. Despain and D. Patterson, "X-Tree: A Tree Structured Multi-Processor Computer Architecture," Proc. Fifth Annual Symp. Computer Architecture, Apr. 1978, pp. 144-150.
18. R.M. Keller, G. Lindstrom, and S.S. Patil, "A Loosely-Coupled Applicative Multi-processing System," AFIPS Conf. Proc., 1979 NCC, New York, June 1979, pp. 613-622.
19. C. Leung, On a Design Methodology for Packet Communication Architectures Based on a Hardware Design Language, submitted for publication.
20. A.R. Tripathi and G.J. Lipovski, "Packet Switching in Banyan Networks," Proc. Sixth Annual Symp. Computer Architecture, Apr. 1979, pp. 160-167.
21. G.A. Boughton, Routing Networks in Packet Communication Architectures, MS thesis, Dept. of Electrical Engineering and Computer Science, MIT.
