Declaration of Rachel J. Watters on Authentication of Publication

I, Rachel J. Watters, am a librarian, and the Director of Wisconsin TechSearch ("WTS"), located at 728 State Street, Madison, Wisconsin, 53706. WTS is an interlibrary loan department at the University of Wisconsin-Madison. I have worked as a librarian at the University of Wisconsin library system since 1998. I have been employed at WTS since 2002, first as a librarian and, beginning in 2011, as the Director. Through the course of my employment, I have become well informed about the operations of the University of Wisconsin library system, which follows standard library practices.

This Declaration relates to the dates of receipt and availability of the following:

Thurber, K. J., & Wald, L. D. (1975). Associative and parallel processors. ACM Computing Surveys, 7(4), 215-255.

Standard operating procedures for materials at the University of Wisconsin-Madison Libraries: When an issue was received by the Library, it would be checked in, stamped with the date of receipt, added to library holdings records, and made available to readers as soon after its arrival as possible. The procedure normally took a few days or at most 2 to 3 weeks.

Exhibit A to this Declaration is a true and accurate copy of the issue cover, with library date stamp, of ACM Computing Surveys (1975), from the University of Wisconsin-Madison Library collection. Exhibit A also includes an excerpt of pages 215 to 255 of that issue, showing the article entitled Associative and parallel processors (1975). Based on this information, the date stamp on the journal title page indicates that Associative and Parallel Processors (1975) was received by the Engineering Library at the University of Wisconsin-Madison on February 25, 1976.

Based on the information in Exhibit A, it is clear that the volume was received by the library on or before February 25, 1976, and was catalogued and available to library patrons within a few days or at most 2 to 3 weeks after February 25, 1976.

I declare that all statements made herein of my own knowledge are true and that all statements made on information and belief are believed to be true; and further that these statements were made with the knowledge that willful false statements and the like so made are punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States Code.

Date: January 28, 2020

Wisconsin TechSearch
Memorial Library
728 State Street
Madison, Wisconsin 53706

Rachel J. Watters
Director

[Exhibit A: cover of ACM Computing Surveys, December 1975, bearing a University of Wisconsin-Madison library date stamp. Special Issue: Computer Systems Architecture. Author names visible on the cover include J. Thurber, D. Wald, I. Organick, Patil, W. Keller, A. Anderson, and D. Jensen.]

COMPUTING SURVEYS
The Survey and Tutorial Journal of the ACM

Volume 7 Number 4 December 1975

Editor in Chief
ELLIOTT I. ORGANICK
Department of Computer Science
University of Utah
Salt Lake City, Utah 84112

Executive Editor
LEE REVENS
1133 Avenue of the Americas
New York, New York 10036

The Surveys Editorial Board (Terms through 1978)
HERBERT S. BRIGHT, Computation Planning, Inc.
THOMAS A. CHEATHAM, Harvard University
FERNANDO J. CORBATO, Project MAC/MIT
PETER J. DENNING, Purdue University
JACK B. DENNIS, MIT
RICHARD KARP, University of California, Berkeley
RAYMOND E. MILLER, IBM
GERARD SALTON, Cornell University
MARTIN H. SCHULTZ, Yale University
PETER WEGNER, Brown University
ERIC A. WEISS, Sun Services Corp.

Editorial Staff
Associate Editor: Victoria Gillen
Administrative Assistant: Mary Neumeyer

Surveys and tutorials are solicited. Submittals should be addressed, in triplicate, to one of the editors. Manuscripts, including bibliographic material, must be typed double spaced, with the original on bond paper.

Correspondence about accepted papers should be addressed to the Executive Editor.

Subscription rates (annual): ACM members, $7.00 (single copies $3.00); nonmembers, $30.00 (single copies $8.00).

When payment is included with membership applications, renewal notices, or invoices, please use the address Association for Computing Machinery, P.O. Box 12105, Church Street Station, New York, NY 10249.

Notices of address changes and requests for membership applications and subscription information should be addressed to Association for Computing Machinery, 1133 Avenue of the Americas, New York, NY 10036. Allow 6 to 8 weeks for change of name and address or new membership to become effective. Send old label with new address notification. To avoid interruption of service, notify your local Post Office before change of residence; for a fee, the Post Office will forward 2d and 3d class periodicals.

Advertising: ACM, 1133 Avenue of the Americas, New York, N.Y. 10036. (212) 265-6300.

Published quarterly in March, June, September, and December at Mt. Royal & Guilford Aves., Baltimore, Md. 21202 by the Association for Computing Machinery, Inc. Copyright © 1975 Association for Computing Machinery, Inc. Second-class postage paid at New York, New York and at additional mailing offices.

ASSOCIATION FOR COMPUTING MACHINERY
Founded in 1947 as the society of the computing community, the Association for Computing Machinery is dedicated to the development of information processing as a discipline, and to the responsible use of computers in an increasing diversity of applications.

COUNCIL
Jean E. Sammet, President
Herbert R. J. Grosch, Vice President
John W. Hamblen, Secretary
Aaron Finerman, Treasurer
John A. Gosden, Chairman, Publications Board
David H. Brandin, Chairman, SIG/SIC Board
Anthony Ralston, Past President

Members-at-Large
Herbert S. Bright (1974-76)
Anita Cochran (1974-76)
Peter J. Denning (1974-78)
M. Stuart Lynn (1974-76)
Daniel D. McCracken (1974-78)
Peter Wegner (1972-76)

Regional Representatives
Thomas Angeli, Allegheny Region (1973-76)
George G. Dodd, East Central Region (1973-76)
Barry Gordon, Greater New York Region (1973-76)
Carl Hammer, Capital Region (1974-77)
Fred H. Harris, N. Central Region (1972-75)
Willard J. Holden, Pacific Region (1972-75)
Horst Hünke, European Region (1974-77)
Olin G. Johnson, S. Central Region (1972-75)
Roger L. Mills, S. California Region (1974-77)
Gerard A. Salton, Northeast Region (1972-75)
John L. Tischhauser, Mountain Region (1974-77)
Larry L. Bell, Southeast Region (1975-76)

EXECUTIVE DIRECTOR
Sidney Weinstein

EXECUTIVE SECRETARY
Irene Hollister

PUBLICATIONS BOARD
John A. Gosden, Chairman
Aaron Finerman
Patrick C. Fischer
Willard J. Holden
Raymond E. Miller
B. L. Trippett
Martha E. Williams

OTHER ACM PUBLICATIONS
Communications of the ACM
Computing Reviews
Journal of the Association for Computing Machinery
Transactions on Mathematical Software
Collected Algorithms
ACM Monograph Series
Proceedings of Conferences
Special Publications

Associative and Parallel Processors

KENNETH J. THURBER
Product Development Group, Sperry Univac Defense Systems, St. Paul, Minnesota 55165
Computer, Information and Control Sciences Department, University of Minnesota, Minneapolis, Minnesota 55455

LEON D. WALD
Test Equipment Engineering, Government and Aeronautical Products Division, Honeywell, Inc., Minneapolis, Minnesota

This paper is a tutorial survey of the area of parallel and associative processors. The paper covers the main design tradeoffs and major architectures of SIMD (Single Instruction Stream Multiple Data Stream) systems. Summaries of ILLIAC IV, STARAN, OMEN, and PEPE, the major SIMD processors, are included.

Keywords and Phrases: associative processor, parallel processor, OMEN, STARAN, PEPE, ILLIAC IV, architecture, large-scale systems, SIMD processors, array processors, ensemble.

CR Categories: 3.80, 4.20, 6.22

Copyright © 1976, Association for Computing Machinery, Inc. General permission to republish, but not for profit, all or part of this material is granted, provided that ACM's copyright notice is given and that reference is made to this publication, to its date of issue, and to the fact that reprinting privileges were granted by permission of the Association for Computing Machinery.

CONTENTS

INTRODUCTION
  Purpose and Scope
  Surveys
  Classification of Computer Architectures
  Definitions
  Reasons for Use of SIMD Processors
  Applications
  Matrix Multiplication on a Parallel Processor
SIMD GENEALOGY
MAIN TRADEOFF ISSUES
  Associative Addressing
  The State Transition Concept and Associative Processor Algorithms
  Processing Element (PE) Interconnections
  I/O
  Host Interaction
  Memory Distribution
  Software
  Activity Control and Match Resolution
  Balance Between Logic and Memory
ASSOCIATIVE PROCESSORS
  Introduction
  DLM (Distributed Logic Memory)
  Highly Parallel Associative Processors
  Mixed Mode and Multidimensional Memories
  Implemented Associative Processors
  Associative Processor Simulators
PARALLEL PROCESSORS AND ENSEMBLES
  Introduction
  Unger's Machine
  SOLOMON I, SOLOMON II, and ILLIAC IV
  Machines With Other Interconnection Structures
  Orthogonal Computer
  PEPE
  Comment
ACTUAL MACHINES
  Introduction
  STARAN
  OMEN
  PEPE
  ILLIAC IV
CONCLUSION
REFERENCES

INTRODUCTION

Purpose and Scope

The purpose of this paper is to present a tutorial survey on the subject of parallel and associative processors. The paper covers the topics of system categorizations, applications, main tradeoff issues, historically important architectures, and the architectures of systems that are currently available.

Currently, the microprocessor/computer-on-a-chip revolution is providing the potential for production of very cost-effective high-performance computers through utilization of a large number of these processors in parallel or in a network. The parallel and associative processors provide a class of architectures which can be readily used to take immediate advantage of the microprocessor technology.

Surveys

A number of good surveys on the subject (or bordering on the subject) of parallel and associative processors have been published [1-10]. Some of these surveys are only of historical interest due to their age. Several conferences in this area may also be of interest [11-14].

Classification of Computer Architectures

Many approaches to classification of computer architectures are possible. Most techniques use global architecture properties and are thus valid only within limited ranges. The main classification techniques discussed below provide a framework within which to view associative and parallel processors.

Flynn [15] proposed a classification scheme that divides systems into categories based upon their procedure and data streams. The four categories are:
1) SISD (Single Instruction Stream Single Data Stream) — a uniprocessor such as a single processor IBM 360.
2) MISD (Multiple Instruction Stream Single Data Stream) — a pipeline processor such as CDC STAR.
3) SIMD (Single Instruction Stream Multiple Data Stream) — an associative or parallel processor such as ILLIAC IV.
4) MIMD (Multiple Instruction Stream Multiple Data Stream) — a multiprocessor such as a Univac 1108 multiprocessor system.

Murtha and Beadles [16] proposed a classification technique based upon parallelism properties. Their categories are:
1) General-purpose network computer;
2) Special-purpose network with global parallelism; and
3) Non-global, semi-independent network with local parallelism — this category is a catchall for machines which do not fit into the first two categories.
Categories 1 and 2 have the following subclasses:
1.a) General-purpose network with centralized common control;
1.b) General-purpose network with identical processors but independent instruction execution actions;
2.a) Pattern processors;
2.b) Associative processors.
The purpose of this classification technique was to differentiate between multiprocessors and highly parallel organizations.

Another possible classification view, suggested by Hobbs, et al. [1], consists of the following categories:
1) Multiprocessor,
2) Associative processor,
3) Network or array processor, and
4) Functional machines.
Further, Hobbs, et al. suggested that architectures could be classified based upon the amount of parallelism in:
1) Control,
2) Processing units, and
3) Data streams.
However, it was noted that these parameters were present in all highly parallel machines and were therefore not adequate to define a machine architecture.

In his article against parallel processors, Shore [17] presents a unique classification technique which derives the machine descriptions from the description of a uniprocessor. The machine categories considered are summarized as follows:
1) Machine I — a uniprocessor.
2) Machine II — a bit-slice associative processor built from Machine I by adding bit-slice processing and access capability (e.g., STARAN).
3) Machine III — an orthogonal computer derived from Machine II by adding parallel word processing and access capability (e.g., OMEN).
4) Machine IV — a machine derived from Machine I by replicating the processing units (e.g., PEPE).
5) Machine V — a machine derived from Machine IV by adding interconnections between processors (e.g., ILLIAC IV).
6) Machine VI — a machine derived from Machine I by integrating processing logic into every memory element (e.g., Kautz's logic-in-memory computers [85, 86]).

Higbie [67] classifies computers using Flynn's four basic categories. However, he expands the SIMD category into the following four subcategories:
1) Array Processor — a processor that processes data in parallel and addresses the data by address instead of by tag or value.
2) Associative Memory Processor — a processor that operates on data addressed by tag or value rather than by address (Note: this definition does not require parallel operation; however, the definition does allow for machines that operate on data in parallel).
3) Associative Array Processor — a processor that is associative and also operates on arrays of data (typically, the operations are on a bit-slice basis; i.e., a single bit of many words).
4) Orthogonal Processor — a processor with two subsystems — an associative array processor subsystem and a serial (SISD) processor subsystem — which share a common memory array.

Higbie's categories provide for the identification of the ability to perform parallel, associative, and serial processing. Higbie defines a parallel processor to be any computer that contains multiple arithmetic units and operates on multiple data streams. Clearly, all of Higbie's four subcategories of SIMD processors fit his definition of a parallel processor.

Although none of the classification schemes presented here are mathematically precise, they are necessary to place associative and parallel processors in perspective. In the next section the terminology of associative and parallel processors will be defined for use in the remainder of this paper.

Definitions

The definitions used for the remainder of this paper are based upon Flynn [15], and are consistent with those used by Thurber [10]. The definitions are:
1) SIMD machine: any computer with a single global control unit which drives multiple processing units, all of which either execute or ignore the current instruction.
2) Associative Processor: any SIMD machine in which the processing units (or processing memory) are addressed by a property of the data contents rather than by address (i.e., multiply A and B together in all elements where A < B).
3) Parallel Processor: any SIMD machine in which the processing elements are of the order of complexity of current day small computers and which has, typically, a high level of interconnectivity between processing elements.
4) Ensemble: a parallel processor in which the interconnection level between processing elements is very low or nonexistent.
Clearly, the intersection of these definitions will not be null, due to the overlap in classifying machine architectures.

Reasons for Use of SIMD Processors

There are many reasons for the use of SIMD architectures. Some of the most important are:
1) Functional
   a) Large problems such as weather data processing.
   b) Problems with inherent data structure and parallelism.
   c) Reliability and graceful degradation.
   d) System complexity.
   e) Computational load.
2) Hardware
   a) Better use of hardware on problems with large amounts of parallelism.
   b) Advent of LSI microprocessors.
   c) Economy of duplicate structures.
   d) Lower nonrecurring and recurring costs.
3) Software
   a) Simpler than for a multiprocessor.
   b) Easier to construct large systems.
   c) Less executive function requirements.
However, the reader is cautioned to remember that these are special-purpose machines, and any attempt to apply them to an incorrectly sized, or designed, problem is an exercise in futility.

Applications

Numerous applications have been proposed for associative and parallel processors. Whether an application is ever implemented is based upon the economics of the problem. It is not enough to be able to describe a problem solution. To date, considerable energy has been expended searching for cost-effective parallel solutions, but little has been done to actually implement them. Examples of such applications are: matrix manipulation, differential equations, and linear programming.

Several areas of application appear quite well suited to associative and parallel processing. In some cases, SIMD processors provide cost-effective system augmentation (e.g., air traffic control and associative head-per-track disks). In others, SIMD processors are very close to functional analogs of the physical system (e.g., data compression, concentration, and multiplexing). The use of associative and parallel processors appears promising in the elimination of critical bottlenecks in current general-purpose computer systems; however, only very small associative memories are required, and cost is really not a critical factor in the solution of these problems. Such problems may be encountered in the management of computer resources, and might involve such items as protection mechanisms, resource allocation, and memory management.

The application trend for associative processors is well defined. Due to cost factors, applications will be limited (in the near future) to problems such as resource management, virtual memory mechanisms, and some augmentation of current systems, rather than to data-base management or large file searching and processing. The application trend for parallel processors and ensembles seems to be in the area of large data computation problems such as weather data processing, nuclear data processing, and ballistic missile defense. Researchers must concentrate on systems aspects of the problem, not on developing solutions to problems which are ill-defined or nonexistent. The only truly useful data-processing applications of parallel and associative processors have been developed through extensive work at the system level, followed by parallel systems design, rather than vice versa.

SIMD processors have been proposed and appear well suited for a number of applications; these are listed here along with their primary references. Factors to be considered in applying SIMD processors have been discussed elsewhere [18]. To be useful for solving the listed problems, the highly parallel processors must become more cost-effective. The suggested application areas are:
1) Applications in which associative processors appear to be cost-effective:
   a) Virtual memory mechanisms [19]
   b) Resource allocation [20]
   c) Hardware executives [21]
   d) Interrupt processing [22,23]
   e) Protection mechanisms [19]
   f) Scheduling [20]
2) Applications in which parallel processors appear cost-effective and in which associative processors may be cost-effective:
   a) Bulk filtering [24]
   b) Tracking [25]
   c) Air traffic control [26]
   d) Data compression [27]
   e) Communications multiplexing [28,29]
   f) Signal processing [30]
3) Applications in which associative processors would be useful if they were more cost-effective and in which parallel processors are probably cost-effective are:
   a) Information retrieval [31]
   b) Sorting [32]
   c) Symbol manipulation [33]
   d) Pattern recognition [34]
   e) Picture processing [35]
   f) Sonar processing [36]
   g) Sea surveillance [37]
   h) Graph processing [38]
   i) Dynamic programming [39]
   j) Differential equations [40]
   k) Eigenvector [41]
   l) Matrix operations [42]
   m) Network flow analysis [43]
   n) Document retrieval [44]
   o) Data file manipulation [45]
   p) Machine document translation [46]
   q) Data file searching [47]
   r) Compilation [48]
   s) Formatted files [49]
   t) Automatic abstracting [50,51]
   u) Dictionary-look-up translations [52]
   v) Data management [53]
   w) Theorem proving [54]
   x) Computer graphics [55]
   y) Weather data processing [56]

Hollander's paper [2] presents many experts' opinions about the applicability of associative and parallel processors. Slotnick [57] and Fuller [58] have compared associative and parallel processors. They concluded that parallel processors appear more useful than associative processors, but the machines would always be special purpose. Shore [17] has presented a very eloquent case against SIMD machines and parallel processors.

Matrix Multiplication on a Parallel Processor

Matrix manipulations can be used to illustrate the potential of a parallel processor. One of the major computations in many of the possible applications cited previously is the matrix multiply. Assume that a parallel processor such as that shown in Figure 1 is used to perform this operation.

Each processing element will be assumed to have three registers, AREG, BREG, and CREG.

[Figure 1. Parallel Processor with PE connections to the Four Nearest Neighbors: a control unit issues instructions and exchanges memory data with all PEs in the array.]

Each cell is interconnected to its four nearest neighbors. Cannon [59] derived the following algorithm to multiply two n x n matrices together in n stages using this processor.

Algorithm:
  Set:      CREG = 0
            BREG[PE I,J] = B[I,J] for all I,J <= N
            AREG[PE I,J] = A[I,J] for all I,J <= N
  Shift:    Ith row of A left I-1 columns, for all I <= N
            Jth column of B up J-1 rows, for all J <= N
  Multiply: (TREG = AREG times BREG) In Parallel in all PEs
  Add:      (CREG = CREG + TREG) In Parallel in all PEs
  Shift:    AREG right one column
            BREG down one row
  Jump:     If not Nth pass, Jump to Multiply

As an example let:

  A = | a1 a2 a3 |      B = | b1 b2 b3 |
      | a4 a5 a6 |          | b4 b5 b6 |
      | a7 a8 a9 |          | b7 b8 b9 |

and C = A x B.

After initialization, the memory map for CREG is:

  CREG = | 0 0 0 |
         | 0 0 0 |
         | 0 0 0 |

For AREG, the memory map is:

  AREG = | a1 a2 a3 |
         | a5 a6 a4 |
         | a9 a7 a8 |

And the BREG map is:

  BREG = | b1 b5 b9 |
         | b4 b8 b3 |
         | b7 b2 b6 |

After the multiply, add, shift, and jump, the memory maps appear as follows:

  CREG = | a1b1 a2b5 a3b9 |
         | a5b4 a6b8 a4b3 |
         | a9b7 a7b2 a8b6 |

  AREG = | a3 a1 a2 |
         | a4 a5 a6 |
         | a8 a9 a7 |

After two more iterations the multiply will be finished and CREG[PE I,J] will contain C[I,J].
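
The step-by-step behavior above can be checked with a short simulation. The following Python sketch is illustrative only and is not from the paper: it models the PE grid as ordinary lists, borrows the register names AREG, BREG, TREG, and CREG, and uses the skew and shift directions as reconstructed above.

# Illustrative simulation (not from the paper) of Cannon's algorithm on
# an N x N grid of PEs, each holding AREG, BREG, TREG, and CREG values.
N = 3

def rotate_row(M, i, k):
    # End-around shift of row i left by k columns (k < 0 shifts right).
    M[i] = M[i][k % N:] + M[i][:k % N]

def rotate_col(M, j, k):
    # End-around shift of column j up by k rows (k < 0 shifts down).
    col = [M[i][j] for i in range(N)]
    col = col[k % N:] + col[:k % N]
    for i in range(N):
        M[i][j] = col[i]

def cannon_multiply(A, B):
    AREG = [row[:] for row in A]            # Set: AREG = A, BREG = B,
    BREG = [row[:] for row in B]            # and CREG = 0 in every PE
    CREG = [[0] * N for _ in range(N)]
    for i in range(N):                      # Shift: Ith row of A left
        rotate_row(AREG, i, i)              # I-1 columns (0-indexed here)
    for j in range(N):                      # Jth column of B up J-1 rows
        rotate_col(BREG, j, j)
    for _ in range(N):                      # N passes
        for i in range(N):                  # Multiply and Add, which a
            for j in range(N):              # real machine does in parallel
                TREG = AREG[i][j] * BREG[i][j]
                CREG[i][j] += TREG
        for i in range(N):
            rotate_row(AREG, i, -1)         # Shift: AREG right one column
        for j in range(N):
            rotate_col(BREG, j, -1)         # BREG down one row
    return CREG                             # CREG[I][J] now holds C[I,J]

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
B = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]
expected = [[sum(A[i][k] * B[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]
assert cannon_multiply(A, B) == expected

Running the sketch confirms that after N multiply-add-shift passes each position of CREG holds the corresponding element of C = A x B.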

SIMD GENEALOGY

Figure 2 indicates the major parallel and associative architectures considered in this paper, together with their generic relationships. These may differ substantially, yet they are related by virtue of their global associative or parallel properties.

[Figure 2. SIMD Genealogy: a tree relating the SIMD machines considered in this paper. Branches for associative processors, parallel processors, and ensembles connect, among others, Memex (72), the cryogenic catalog memory (73), distributed logic memory (31), Unger's machine, SOLOMON I (108) and SOLOMON II (109), ILLIAC IV (57), the Orthogonal Computer (105), OMEN (112), STARAN (114), PEPE (115), and mixed mode and multidimensional memories (23).]

MAIN TRADEOFF ISSUES

To allow the reader to place the presented architectures in perspective, the major system tradeoff issues in SIMD processors will be discussed next. It should be understood that SIMD processors are special purpose. They are useful for a limited set of applications which may be characterized as having [18]:
1) A large number of independent data sets;
2) No relationships between data sets that prevent them from being processed in parallel;
3) The requirement for quick throughput response; and
4) The potential to exploit associative addressing selection techniques.

Associative Addressing

Figure 3 indicates the main features required to implement an associative address process. The designer must include facilities to address or query the processor associatively, i.e., to address the memory array by content. In Figure 3, the application may require that all employees with a salary of $35 per day be located. This would be accomplished by performing an equality search on the daily salary field of the file. This search is performed in parallel. To set up the search, a Data Register must be loaded with the daily salary ($35) for comparison; a Mask Register is included to mask the data register so that only the desired fields may be searched; a Word Select (activity select) Register specifies the subset of words to be addressed; a Results Register is required to collect the results of the search; a Match Indicator is used to indicate the number of matches; and a Multiple Match Resolver must be provided to generate a pointer to the "topmost" matched word. An eight-word example of a personnel file stored in a simple associative memory is illustrated in Figure 3; it assumes that the search for a salary of $35 per day was initiated on the entire file. All necessary registers and features are illustrated.

[Figure 3. Associative Processor Storing a Personnel File: a memory array of employee records, with fields for daily salary and employee number, surrounded by the Data Register, Mask Register, Word Select Register, Search Results Register, Match Indicator, Multiple Match Resolver, and external processing logic.]
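
The register set just described can be made concrete with a small model. The following Python sketch is an illustration rather than anything taken from the paper; the word layout and the personnel data are assumptions invented for the example.

# Illustrative model (not from the paper) of the Figure 3 search: a
# parallel equality search on the daily-salary field of an eight-word
# personnel file. Assumed word layout: low 8 bits hold the daily
# salary, the remaining bits hold the employee number.
SALARY_BITS = 8
SALARY_MASK = (1 << SALARY_BITS) - 1

def word(emp_no, salary):
    return (emp_no << SALARY_BITS) | salary

memory = [word(n, s) for n, s in
          [(101, 30), (102, 35), (103, 40), (104, 35),
           (105, 25), (106, 35), (107, 50), (108, 20)]]

data_register = word(0, 35)         # comparand: a salary of $35 per day
mask_register = SALARY_MASK         # restrict the search to that field
word_select = [1] * len(memory)     # activity select: all words active

# An associative memory compares every selected word simultaneously;
# the comprehension below stands in for that parallel comparison.
results_register = [int(sel and (w & mask_register) ==
                        (data_register & mask_register))
                    for w, sel in zip(memory, word_select)]

match_indicator = sum(results_register)   # number of matches (3 here)
# Multiple match resolver: a pointer to the "topmost" matched word.
top_match = results_register.index(1) if match_indicator else None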

The State Transition Concept and Associative Processor Algorithms

In 1965, Fuller and Bird [35] described a bit-slice associative processor architecture. Fuller and Bird discussed their algorithms in terms of state transitions. To perform parallel processing of data in an associative processor, two basic operations are required. These are a content search and a "multiwrite" operation. The multiwrite operation consists of the ability to simultaneously write into selected bits of several selected words. With a content search and multiwrite capability, an associative processor can be used to process data in a highly parallel fashion.

For example, consider the problem of adding field A to field B. This can be accomplished as a sequence of bit pair additions (Ai added to Bi), with suitable attention paid to the carry bit. If bit 1 is the least significant and bit n the most, Figure 4 indicates the state table for the bit-serial addition of A to B, with the result accumulating into B. As shown, Bi (result) and Ci+1 differ from Bi (operand) and Ci in the four states 1, 3, 4, and 6. One possible addition method would be to search for these states and multiwrite the correct Bi and Ci+1 into the appropriate fields, i.e., set up a "tag field" designated as Ci, and set C1 = 0. Then operations would proceed from the least significant bit to the most significant bit of the pair of operands. At each bit position i, the fields Ai, Bi, and Ci would be searched for state 1. Then Ci+1 = 0, Bi = 1 would be multiwritten into all matching words. After a search for state 3, Ci+1 = 1, Bi = 0 would be multiwritten into all matching words. This same procedure would be carried out for states 4 and 6. If a word is matched on any search, it would be removed from the list of active cells until operations moved from bit position i to i+1, so that writing the resultant Bi and the carry-in for bit position i+1 into the memory does not cause the data already processed to be reprocessed. Obviously, utilizing the sequential state transformation approach, any desired transformations can be processed in an associative processor with simple content search and multiwrite capability using very simple word logic.
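
A minimal sketch of this addition procedure follows, assuming that Figure 4 numbers the states as 4*Ai + 2*Bi + Ci; that numbering is an inference, but it reproduces the states 1, 3, 4, and 6 named in the text together with the multiwrite values given for states 1 and 3. The Python below is an illustration, not the paper's code.

# Sketch (not from the paper) of bit-serial addition of field A to
# field B by content search and multiwrite. Each word carries two
# NBITS-wide fields plus a one-bit carry tag C, initially 0.
NBITS = 4

class Word:
    def __init__(self, a, b):
        self.A = [(a >> i) & 1 for i in range(NBITS)]   # LSB first
        self.B = [(b >> i) & 1 for i in range(NBITS)]
        self.C = 0                                      # carry tag, C1 = 0

memory = [Word(3, 5), Word(7, 8), Word(2, 2)]

# (Ai, Bi, Ci) search patterns for states 1, 3, 4, and 6, with the
# (Bi, Ci+1) values multiwritten into every matching word.
TRANSITIONS = {(0, 0, 1): (1, 0),   # state 1
               (0, 1, 1): (0, 1),   # state 3
               (1, 0, 0): (1, 0),   # state 4
               (1, 1, 0): (0, 1)}   # state 6

for i in range(NBITS):                 # least significant bit first
    active = set(range(len(memory)))   # matched words drop out until i+1
    for pattern, (new_b, new_c) in TRANSITIONS.items():
        matched = [k for k in active if
                   (memory[k].A[i], memory[k].B[i], memory[k].C) == pattern]
        for k in matched:              # the multiwrite is simultaneous
            memory[k].B[i] = new_b     # on a real associative processor
            memory[k].C = new_c
        active -= set(matched)

sums = [sum(bit << i for i, bit in enumerate(w.B)) for w in memory]
assert sums == [8, 15, 4]              # 3+5, 7+8, 2+2

Removing matched words from the active list is essential: writing state 4's result, for instance, produces the bit pattern of state 6, which must not be matched again at the same bit position.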

[Figure 4. Bit-Slice Addition State Transition Diagram: present state (Ai, Bi, Ci) against the resulting partial sum Bi and the carry into bit position i+1 from bit position i.]

Operating on the state transition concept, six general requirements of the architecture of an associative processor were deduced. These are:
1) The basic processing primitive is the identification of all words meeting some binary state variable configuration;
2) No explicit addressing is necessary, since processing can be performed simultaneously over all data sets of interest;
3) Data is processed "column serially" to limit the number of state transformations to be remembered;
4) Tag bits (temporary bit-slice storage) are very useful (for example, the carry bit);
5) A search results/word select register and multiword write capability are desirable; and
6) Stored program control of the associative processor is desirable.

Fuller and Bird recognized an important correlation between the way in which data is stored in the processor's memory and its processing speed. As examples of this phenomenon, they described both word per cell (W/C) and bit per cell (B/C) data organizations for their associative processor. Figures 5 and 6 show the W/C and B/C storage organizations for nine four-bit words X11, ..., X33, respectively. The word per cell organization has become an industry standard, and it is, in fact, the way most associative processors store data. The desire to perform bit-slice processing on many data sets simultaneously without much interword communication led to the widespread use of this type of organization.

[Figure 5. Word Per Cell Organization: bits 0 to 3 of each word are stored together in a single cell.]

W/C is very efficient if a large number of cells are involved; operands can be stored in the same cell, and most operands are being processed by each bit-slice command. For example, in Figure 5, this type of organization would be very efficient if, for all operands Xi1, Xi2, Xi3, it were desired to calculate Xi1 + (Xi2 * Xi3) for all i. In this case all cells could be activated and the calculation could be performed in a bit-serial manner over all operands.

However, there are problems in which a B/C approach is more efficient. (The B/C approach is the same as that described by Crane and Githens [76], and was discovered independently by Fuller and Bird [35].) The B/C organization allows many operands to be processed simultaneously in a bit-parallel mode. Because many operands are treated in a bit-parallel fashion, B/C processing can be very efficient, even on only a few operands, in comparison with W/C processing. Also, operands stored in the same bit slice can be easily picked up. Therefore, B/C processing can be efficient in cases where only a few operands are activated for processing by an instruction, or for applications in which extensive pairing of operands must occur.
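
Since Figures 5 and 6 are not legible in this copy, the following Python sketch contrasts the two organizations under an assumed layout: W/C keeps all bits of one word in a single cell, while B/C gathers one bit position of many words into a cell.

# Illustrative W/C versus B/C layouts (assumed, as Figures 5 and 6 are
# not reproduced here) for nine 4-bit words X11, ..., X33.
words = {"X%d%d" % (i, j): (4 * i + j) % 16
         for i in range(1, 4) for j in range(1, 4)}

# Word per cell: cell k holds bits 0..3 of word k (LSB first), so one
# bit-slice command touches bit i of every word at the same time.
w_per_cell = {name: [(v >> b) & 1 for b in range(4)]
              for name, v in words.items()}

# Bit per cell: cell b holds bit b of every word (the transpose), so a
# single command processes many operands in a bit-parallel fashion.
b_per_cell = [[(v >> b) & 1 for v in words.values()] for b in range(4)]

# The same data organized two ways: a W/C bit slice equals a B/C cell.
assert [bits[2] for bits in w_per_cell.values()] == b_per_cell[2]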

Using the state transition concept, associative processor algorithms may be clearly described for a variety of architectures. Considering the range of architectures currently available, no attempt will be made to describe the algorithms here. Rather, a list will be given
