`
`by WILLIAM A. WULF and C. G. BELL
`CarnegieMeUm University
`Pittsburgh, Pennsylvania
`
`INTRODUCTION AND MOTIVATION
`
`REQUIREMENTS
`
`I n the Summer of 1971 a project was initiated a t CMU
`to design the hardware and software for a multi-
`processor computer system using minicomputer pro-
`cessors (i.e., PDP-11's). This paper briefly describes an
`overview (only) of the goals, design, and status of this
`hardware/software complex, and indicates some of
`the research problems raised and analytic problems
`solved in the course of its construction.
`Earlier in 1971 a study was performed to examine
`the feasibility of a very large multiproce&or computer
`for artificial intellighce research. This work, reported
`in the proceedings paper by Bell and Freeman, had an
`influence on the hardware structure. I n some sense,
`this work can be thought of as a feasibility study for
`larger multiprocessor systems. Thus, the reader might
`look a t the Bell and Freeman paper for general over-
`view and potential, while this paper has more specific
`details regarding implementation since it occurs later
`and is concerned with an active project. It is recom-
`mended that the two papers be read in sequence.
`The following section contains requirements and
`background information. The next section describes
`the hardware structure. This section includes the
`analysis of important problem in the hardware design:
`interference due to multiple processors accessing a
`common memory. The operating system philosophy,
`and its structure is given together with a detailed anal-
`ysis of one of the problems incurred in the design. One
`problem is determining the optimum number of "locks"
`which are in the scheduling primitives. The final section
`discusses a few programming problems which may
`arise because of the possibilities of parallel processing.
`
`*This work was supported by the Advanced Research Projects
`Agency of the Office of the Secretary of Defense (F44620-70-0107)
`and is monitored by the Air Force Office of Scientific Research.
`
`The CMU multiprocessor project is designed to
`satisfy two requirements :
`
`1. particular computation requirements of existing
`research projects; and
`2. research interest in computer structures.
`
`The design may be viewed as attempting to satsify the
`computational needs with a system that is conserva-
`tive enough to ensure successful construction within a
`two year period while first satisfying this constraint,
`the system is to be a research vehicle for multiprocessor
`systems with the ability to support a wide range of
`investigations in computer design and systems pro-
`gramming.
`The range of computer science research a t CMU
`(i.e., artificial intelligence, system programming, and
`computer structures) constrains processing power, data
`rates, and memory requirements, etc.
`
`(1) The artificial intelligence research a t CMU
`concerned with speech and vision imposes two
`kinds of requirements. The first, common to
`speech and vision, is that special high data rate,
`real time interfaces are required to acquire data
`from the external environment. The second more
`stringent requirement, is real time processing for
`the speech-understanding system. The forms of
`parallel computation and intercommunication
`in multiprocessor is a matter for intensive
`investigation, but seems to be a fruitful approach
`to achieve the necessary processing capability.
`(2) There is also a significant effort in research on
`operating systems and on understanding how
`software systems are to be constructed. Research
`in these areas has a strong empirical and ex-
`perimental component, requiring the design
`and construction of many sy~tems. The primary
`
`IPR2023-00033
`Apple EX1024 Page 1
`
`
`
`766
`
`Fall Joint Computer Conference, 1972
`
`requiremmt of these systems is isolation, so
`they can be used in a complctc4y idiosyncratic
`way and be restructured in terms of software
`from thc basic machine. These systems also
`require access by multiple users and varying
`amounts of secondary memory.
`(3) T h i w is also research interest in using Register
`Transfrr Modulw (RTM's) drvcloprd hew and
`a t Digital Equipmrnt Corporation (Brll, Grason,
`(4 al., 1972) and in production as the PDP-16
`arc drsigned to assist in thc fabrication of hard-
`ware/software systcms. A dedicated facility is
`needed for the dwign and testing of experi-
`mental system constructed of these modules.
`
`TIMELIXESS OF MULTIPROCESSOR
`
`We believe that to assemble a multiprocessor system
`today rcquirm rcwarch on multiprocessors. Multi-
`processor systclms (0tht.r than dual processor struc-
`turps) have not bvcome current art. Possibly reasons
`for this state of affairs are:
`
`1. T h r absolutely high cost of proccwors and
`primary mcmorics. A complcx multiprocc~ssor
`systcm was simply beyond thc computational
`rchalm of all but a few extraordinary users, in-
`drpendcnt of the advantage.
`2. The relativt4y high cost of processors in the
`total system. An additional processor did not
`improve the performance/cost ratio.
`3. The unreliability and performance degradation of
`operating system software,-providing
`a still
`more complrx system structure-would
`be
`futile.
`4. The inability of technology to permit construc-
`tion of the central switches required for such
`structures due to low component density and
`high cost.
`5. The loss of performance in multiprocessors due
`to memory access conflicts and switching delays.
`6. The unknown problems of dividing tasks into
`subtasks to be c.xecuted in parallel.
`7. Tho problems of constructing programs for
`execution in a parallel environment. The possi-
`bility of parallel execution demands mechanisms
`for controlling that parallelism and for handling
`increased programming complexity.
`
`I n summary, the expense was prohibitive, even for
`discovering what advantages of organization might
`overcome any inherent dccrernents of performance.
`However, we appear to have now entered a techno-
`
`logical domain when many of the difficulties listed
`above no longer hold so strongly:
`
`1'. Providing we limit ourselves to multiprocessors
`of minicomputers, the total system cost of
`processors and primary mrmories are now within
`the price range of a research and user facility.
`2'. The procrssor is a smaller part of
`the total
`system cost.
`3'. Software reliability is now somewhat improved,
`primarily because a large number of operating
`systclms havc becn constructed.
`4'. Current medium and large scale integrated
`circuit technology enables the construction of
`switches that do not have the large losses of the
`older distributed decentralized switches (i.e.,
`busses).
`5'. Memory conflict is not high for the right balance
`of processors, memories and switching system.
`6'. There has been work on the problem of task
`parallelism, ccntcred around the ILLIAC I V
`and the CDC STAR. Other work on modular
`programming [Krutar, 1971; Wulf, 19711 sug-
`gests how subtasks can be executed in a pipeline.
`7'. Mechanisms for controlling parallel execution,
`fork-join (Conway, 1963), P and V (Dijkstra,
`1968), have been extensively discussed in the
`literature. Methodologies for constructing large
`complex programs are emerging (Dijkstra, 1969,
`Parnas, 1971).
`
`I n short, the price of experimentation appears rea-
`sonable, given that there are requirements that appear
`to be satisfied in a sufficiently direct and obvious way
`by a proposed multiprocessor structure. Moreover,
`there is a reasonable research base for the use of such
`structures.
`
`RESEARCH AREAS
`
`The above state does not settle many issues about
`multiprocessors, nor make its development routine.
`The main areas of research are:
`
`1. The multiprocessor hardware design which we
`call the PMS structure (see Bell and Xewell,
`1971). Few multiprocessors havc been built,
`thus each one represents an important point in
`design space.
`(i.e.,
`2. The processor-memory
`interconnection
`the switch design) especially with respect to
`reliability.
`
`IPR2023-00033
`Apple EX1024 Page 2
`
`
`
`3. The configuration of computations on the multi-
`processor. There are many processing structures
`and little is known about when they are ap-
`propriate and how to exploit them, especially
`when not treated in the abstract but in the con-
`text of an actual processing system:
`Parallel processing: a task is broken into a
`number of subtasks and assigned to separate
`processors.
`Pipeline
`processing: various
`independent
`stages of the task are executed in parallel
`(e.g., as in a co-routine structure).
`Network processing:
`the computers operate
`quasi-independently with intercommunication
`(with various data rates and delay times).
`Functional specialization: the processors have
`either special capabilities or access to special
`devices; the tasks must be shunted to pro-
`cessors as in a job shop.
`Multiprogramming: a task is only executed
`by a single processor at a given time.
`Independent
`p~seessing: a configurational
`separation is achieved for varying amounts
`of time, such that interaction is not possible
`and thus doesn't have to be processed.
`4. The decomposition of
`tasks for appropriate
`computation. Detailed analysis and restructuring
`of the algorithm appear to be required. The
`speech-understanding
`system
`is one major
`example which will be studied. I t is interesting
`both from the multiprocessor and the speech
`recognition viewpoints.
`5. The operating system design and performance.
`The basic operating system design must be
`conservative, since it will run as a computation
`facility, however it has substantial research
`interest.
`6. The measurement and analysis of performance
`of the total system.
`7. The achievement of reliable computation by
`organizational schemes a t higher levels, such as
`redundant computation.
`
`T H E HARDWARE STRUCTURE
`
`This section will briefly describe the hardware design
`without explicitly relating each part to the design con-
`straints. The configuration is a conventional multi-
`processor system. The structure is given in Figure 1.
`There are two switches, Smp and Skp, each of which
`jrovide intercommunication among two sets of com-
`ponents. Smp allows each processor to communicate
`with all primary memories (in this case core). Skp
`
`C.mmp-A Multi-Mini-Processor
`
`767
`
`where: ~ c / c e n t r a l processor; ~ p / p r i m a r ~ memory; ~ / t e r m i n a l s ;
`Ks/slow device control ( e . g . , for Teletype) ;
`Kf/fast device control ( e . g . , for d i s k ) ;
`
`~ c / c o n t r o l for clock, timer, interprocessor c m u n i c a t i o n
`
`' ~ 0 t h switches have s t a t i c configuration control by manual and
`program control
`
`Figure I-Proposed CMU multiminiprocessor
`computer/C.mmp
`
`allows each processor (PC), to communicate with the
`various controllers (K), which in turn manage the
`secondary memories (Ms), and I/O devices trans-
`ducers (T). These switches are under both processor
`and manual control.
`Each processor system is actually a complete com-
`puter with its own local primary memory and con-
`trollers for secondary memories and devices. Each
`processor has a Data operations component, Dmap,
`for translating addresses a t the processor into physical
`memory addresses. The local memory serves both to
`reduce the bandwidth requirements to the central
`memory and to allow completely independent opera-
`tion and off-line maintenance. Some of the specific
`components shown in Figure 1 are :
`
`K.clock: A central clock, K.clock, allows precise
`time to be measured. A central time base is
`broadcast to all processors for local interval
`timing.
`K.interrupt: Any processor is allowed to generate
`an interrupt to any subset of the PC configura-
`tion at any of several priority levels. Any pro-
`
`IPR2023-00033
`Apple EX1024 Page 3
`
`
`
`768
`
`Fall Joint Computer Conference, 1972
`
`cessor may also cause any subset of the con-
`figuration to be stopped and/or restarted. The
`ability of a processor to interrupt, stop, or
`restart another is under both program and
`manual control. Thus, the console loading func-
`tion is carried out via this mechanism.
`Smp: This switch handles information transfers
`between primary memory processors and I/O
`devices. The switch has ports (i.e., connections)
`for m busses for primary memories and p busses
`for processors. Up to rnin(m,p) simultaneous
`conversations possible via the cross-point ar-
`rangement.
`Smp can be set under programmed control or
`via manual switches on an override basis to
`provide different configurations. The control
`of Smp can be by any of the processors, but one
`processor is assigned the control.
`Mp: The shared primary memory, Mp, consists
`of (up to) 16 modules, each of (up to) 65k, 16 bit,
`words. The initial memories being used have the
`following relevant parameters : core technology;
`each module is &way interleaved; access time is
`250 nanoseconds; and cycle time is 650 nano-
`seconds. An analysis of the performance of these
`memories within the C.map configuration is
`given in more detail below.
`Skp: Skp allows one or more of k Unibusses (the
`common bus for memory and i/o on an isolated
`PDP-11 system) which have several slow, Ks
`(e.g., teletypes, card readers), or fast con-
`trollers, Kf, (e.g., disk, magnetic tape), to be
`connected to one of p central processors. The k
`IJnibusses for the controllers are connected to
`the p processor Unibusses on a relatively long
`term basis (e.g., fraction of a second to hours).
`The main reasons for only allowing a long term,
`the k
`but switchable, connection between
`Unibusses and the processor is to avoid the
`problem of having to decide dynamically which
`of the p processors manage a particular control.
`Like Smp, Skp may be controlled either by
`program or manually.
`PC: The processing elements, PC, are slightly
`modified versions of the DEC PDP-11. (Any of
`the PDP-11 models may be intermixed.)
`Dmap: The Dmap is a Data operations component
`which takes the addresses generated in the
`processor and converts them to addresses to use
`on the Memory and Unibusses emanating from
`the Dmap. There are four sets of eight registers
`in Dmap, enabling each of eight 4,096 word
`blocks to be relocated in the large physical
`memory. The size of the physical Mp is 220
`
`words (221 bytes). Two bits in the processor,
`together with the address type are used to
`specify which of the four sets of mapping regis-
`ters is to be used.
`
`Dmap
`
`The structure of the address map, is described below
`and in Figure 2 together with its implications for two
`kinds of programs: the user and the monitor programs.
`For the user program, the conventional PDP-11 ad-
`dressing structure is retained-except
`that a program
`does not have access to the "i/o page," and hence the
`full 16-bit address space refers to the shared primary
`memory.
`A PDP-11 program generates a 16-bit address, even
`though the Unibus has 18-bit addressing capability.
`I n this scheme the additional two address bits are
`obtained from two unused program status (PS) register
`bits. (Note, this register is inaccessible to user pro-
`
`ber'. 16-bit addre..
`
`PDP-11
`
`P 5-
`6- 7 'dl.Pl.c-t
`bat,k 00 [I -r--
`
`I
`
`bank
`
`bank
`
`bank
`
`register selection
`
`-- I
`
`!
`
`21-bit
`
`CMUibus Address
`
`8
`
`I
`
`physical page number
`
`ontrol
`
`exteneion
`
`-format:
`
`1
`1
`1
`1
`A ,\ A n
`
`
`
`4
`
`reserved for expansion of phyaieal page n d e r
`L (resemed)
`
`Figure 2-Format of data in the relocation registers
`
`IPR2023-00033
`Apple EX1024 Page 4
`
`
`
`&rams.) These are two additional bits, provides four
`addressing modes :
`
`These addresses are always mapped, and
`always refer to the shared, large, primary
`memory.
`All but 8 kw (kilo words) of this address
`space is mapped as above. The 8 kw of this
`space which is not mapped refers to the
`private Unibus of each processor; 4 kw of
`this space is for private (local) memory and
`4 kw is used to access i/o devices attached
`to the processor.
`
`For mapped references, the mapping consists of using
`the most significant five bits of the 18-bit address to
`select one of 30 relocation registers, and replacing these
`by the contents of the 8 low order bits of that register
`yielding an overall 21-bit address. Alternatively, con-
`sider that two bits of the PS select one of four banks
`of relocation registers and the leftmost three bits of
`the users (16-bit) address select one of the eight regis-
`ters in this bank (six in bank three). A program may
`(by appropriate monitor calls) alter the contents of
`the relocation registers within that bank and thus alter
`;s "instantaneous virtual memory9'-that
`is, the set
`of directly addressable pages. The format of each of the
`30 relocation registers is as also shown in Figure 2
`where:
`
`The 'written-into' bit is set (to 1) by the hard-
`ware whenever a write operation is performed on
`the specified page.
`The 'write protect' bit, when set, will cause a
`trap on (before) an attempted write operation
`into the specified page.
`The NXM, 'non-existent memory', when set,
`will cause a trap on any attempted access to the
`specified page. Note: this is not adequate for,
`nor intended for, 'page fault' interruption.
`The 8-bit 'physical page number' is the actual
`relocation value.
`
`THE MEMORY INTERFERENCE PROBLEM
`
`One of the most crucial problems in the design of
`this multiprocessor is that of the conflict of processor
`requests for access to the shared memories.
`Strecker (1970) gives closed form solutions for the
`interference in terms of a defined quantity, the UER
`,unit execution rate). The UER is, effectively, the rate
`memory references and, for the PDP-11, is approxi-
`mately twice the actual instruction execution rate.
`
`C.mmp-A Multi-Mini-Processor
`
`769
`
`(Although a single instruction may make from one to
`five memory references, about two is the average.)
`Neglecting i/o transfers*, assuming access requests to
`random, and using the following mean
`memories a t
`parameters :
`
`t P
`
`t,,tc
`
`t, = t, - t,
`
`the time between the completion of one
`memory request and the next request
`the access time and cycle time for the
`memories to be used
`the rewrite time of the memory
`
`Strecker gives the following relations:
`
`t, = L: UER = (m/t,) (1 - (1 - l/m)p)
`
`m 1 - (1 - l/m)p
`t, < t,: UER = - X
`1 - (1 - l/m)p
`t
`t, > t,: UER = (m/t,)(l - (1 - P,/m)p)
`
`Various speed processors, various types of memories,
`and various switch delays, td, can be studied by means
`of these formulas. Switch delays effects are calculated
`by adding to t, and t,, i.e., t,' = td + t,; and t,' =
`t d + t,. For example, the following cases are given in
`the attached graphs. The graphs show UER X lo6 as
`a function of p for various parameters of the memories.
`The two values of t d shown correspond to the estimated
`switch delay in two cable-length cases: 10' and 20'.
`The t,,t, values correspond to six memory systems
`which were considered. The value of t, is that for the
`PDP-11 model 20.
`Given data of the form in Figures 3 and 4 it is pos-
`sible to obtain the cost effectiveness of various proces-
`sor-memory configurations. An example of
`this
`information for a particular memory configuration
`(16 memories, t, = 400) and three different processors
`(roughly corresponding to three models of the PDP-11
`family) is plotted in Figure 5. Note that a small con-
`figuration of five Pc.lls has a performance of 4.5 X lo6
`accesses/second (UER). The cost of such a system is
`approximately $375K, yielding a cost-effectiveness of
`12. Replacing these five processors with the same
`number of Pc.3'~ yields a UER of 15 X lo6 for about
`$625K, or a cost-effectiveness of about 24. Following
`this strategy provides a very cost-effective system
`once a reasonably large number of processors are used.
`
`* A simple argument indicates that i/o traffic is relatively
`insignificant, and so has not been considered in these figures. For
`example, transferring with four drums or 15 fixed head disks at
`full rate is comparable to one PC.
`
`IPR2023-00033
`Apple EX1024 Page 5
`
`
`
`770
`
`Fall Joint Computer Conference, 1972
`
`n-ry:
`
`Processor: r - 700 na (PUP-11 model 201
`p - 1,5.10 ,..., 35
`nvmbsr nnnory nodules - 8
`rc,ta - (300,5),(400,250).(650,3501.
`td - 190,270
`
`i900,3501.(1200.5001
`
`accurate method under consideration is to associate
`a small memory with each crosspoint intersection.
`This can be constructed efficiently by having a memory
`array for each of the m rows, since control is on a row
`(per memory) basis. When each request for a particular
`row is acknowledged, a 1 is added to the register cor-
`responding to the procesor which gets the request.
`These data could then serve as input to algorithms of
`the type described under (1). Such a scheme has the
`drawback of adding hardware (cost) to the switch, and
`possibly lowering reliability. Since the performance
`measures given earlier are quite good, even for large
`numbers of processors, this approach does not seem
`justified a t this time.
`
`A cache
`
`Since performance for all but shared programs may
`approximate the random references assumption of
`Strecker's analysis, special provision for these references
`might be provided. The addition of a cache memory
`between Dmap and Smp allows programs to migrate
`
`for various memory-processor
`Figure 3-Performance
`configurations
`
`I n fact, in the range 15-30 processors the cost-ef-
`fectiveness is relatively constant while the absolute
`performance nearly doubles.
`Unfortunately these studies of memory interference
`assume a random distribution of memory references-
`an assumption may be invalid when true parallel
`processing is performed (notably if shared programs
`are executed, as in the operating system). Several
`approaches to predicting and preventing these con-
`flicts are being studied:
`
`Software page-placements
`
`reference patterns may be
`Better-than-random
`achieved by having the operating system page-place-
`ment algorithms attempt to localize process' pages
`within a single memory module. No results on this
`approach have been obtained to date.
`
`Switch, Smp, measurement
`
`Schemes for dynamically measuring the Mp-PC
`reference pattern are being considered. The most
`
`for various memory-processor
`Figure 4-Performance
`configurations
`
`IPR2023-00033
`Apple EX1024 Page 6
`
`
`
`-(I6
`
`proce..or.;
`
`I6 ..lorlo.);
`
`cd: 190 ns; t c : 4 0 0 )
`
`ProC*..~r.:
`p r . 1 ; tp: 100 n.
`k . 2 ; tp: 450 ns
`pc.3; tp:
`
`200 ".
`
`/ //
`
`Figure 5-Cost effectiveness (UER/$)
`
`into the cache thereby diminishing the number of
`requests for a single memory. This also provides faster
`access since the Smp is avoided.
`By introducing such a cache, however, a potential
`problem is created regarding the validity of data since
`it might be possible to have sixteen different values
`of a single variable a t a given instant of time. A scheme
`for avoiding this is to allow only information from
`"read only" pages (especially instructions) to appear
`in the cache. (In particular, the bit marked 'reserved'
`in Figure 2 is used to signal that data from the page
`may be placed into the cache.) Traces of PDP-11
`programs executions indicate that a small cache (25&
`512 words) will capture 70-90 percent of the eligible
`references and 40-50 percent of all references. McCredie
`(1972) has studied the effect of such a cache on overall
`system performance both analytically and by simula-
`tion. The results of these studies indicate an improve-
`ment of 10-40 percent in overall system performance.
`
`THE OPERATING SYSTEM
`
`C.mmp-A Multi-Mini-Processor
`
`771
`
`specifically for multiprocessor environments. I n par-
`ticular, no systems have been built to support the
`variety of process relations (parallel, pipeline, etc.)
`envisioned for C.mmp. Moreover, there is a relative
`lack of experience in organizing computations for
`parallel execution. These facts have driven the operat-
`ing system design to the following, hopefully conserva-
`tive, position:
`
`The operating system will consist of a "kernel"
`and a "standard Extension." The kernel will
`provide a set of mechanisms (tools) for building
`an operating system, but no policies (e.g., no
`scheduler, no file structure, no. . .). The standard
`extension will implement an (easily modified or
`replaced) set of "conventional" operating system
`facilities (e.g., a scheduler, file system, . . .). The
`kernel will support the (simultaneous) execution
`of an (almost) arbitrary number of extensions.
`
`Under this strategy the variety of computational
`--
`structures is not a priori limited by the structure of
`the underlying system. There are also potential hazards
`in the kernel approach. One of them is the possibility
`that extension in some (important) desired direction
`is not possible because of irrevocable decision made too
`early (though this problem is hardly unique to the
`kernel approach). Another hazard is that intolerable
`overhead might accrue by enforced multiple 'layering'
`of extensions. Both analysis and simulated use indicated
`that neither of these problems exist for the proposed
`design.
`The remainder of this section is devoted primarily
`to a description of the kernel (called HYDRA).
`I n considering what set of mechanisms (tools)
`should be provided by an operating system kernel two
`commonly held views of the essential nature of an
`operating system are relevant:
`
`-An operating system creates a '(virtual machine"
`to support (user) programs by providing resources
`and operations not present in the underlying
`hardware (e.g., "files," file "read" and "write"
`operations, etc.).
`-An
`operating system is a resource (virtual and
`physical) manager and allocator.
`
`Note the emphasis in both views on resources; their
`creation, management, and operations on them. From
`these views we infer than an appropriate set of tools
`for building an operating system must provide for:
`
`Although the technology of operating systems has
`made significant progress in the past decade, there are
`virtually no extant examples of systems constructed
`
`-the creation of new virtual resources;
`-the
`'representation' of a new resource in terms of
`existing ones;
`
`IPR2023-00033
`Apple EX1024 Page 7
`
`
`
`The only primitive operations in the system which
`are not provided by procedures are CALL and RE-
`TURN, whose functions are, respectively, to permit
`entry to, and exit from, procedures. CALL also provides
`parameter checking and establishes the Zns for the
`called procedure.
`To recap: The primitive notions in HYDRA are
`those of an object, a global name, and a type. Some
`specific types are TYPE, PROCTYPE, and LNS-
`TYPE. Procedure objects may be invoked by a CALL
`and are exited by a RETURN. Protection is provided
`by: (1) restricting access to objects to those named in
`the current Ins, (2) restricting the operations (pro-
`cedures) which may be applied to an object to those
`associated with that type of object, and (3) further
`restricting the set of operations which may be applied
`to any object named in an Ins to a subset of those in
`(2).
`Figure 6, gives a concrete example of this mecha-
`nism. Suppose that a new type of object, a "bibli-
`ography file," is created. Three specific operations are
`permitted on these objects: updating, printing, and
`erasing. Therefore three procedure objects UPDATE,
`PRINT, and ERASE are created to perform these
`
`Procedure
`
`io. 1
`
`772
`
`Fall Joint Computer Conference, 1972
`
`creation of operations on resources and/or
`-the
`their representation; and
`-protection
`(against illegal operations on a resource),
`both
`(a) uniformly over a class of resources; and
`(b) with regard to specific instances of a re-
`source.
`
`This list serves as the design goals for HYDRA against
`which the design is evaluated. .
`Since the resources are central to the design, we de-
`fine a suitable abstraction of these called an object;
`objects are the basic entity of interest in HYDRA. An
`object has a name, a type, and usually some other
`(type dependent) information associated with it. The
`name of every object is unique and is called its global
`name. There is a supply of unique global names to last
`over the system's total life. Thus, it is not possible for
`two (or more) objects to have the same global name.
`The set of extant objects is partitioned into equiva-
`lence classed by their types. There is also an unlimited
`supply of object types-new
`types may be created at
`will. The initial system includes a particular object
`whose name is TYPE. New types are created by
`creating an object whose type is TYPE; thus a class of
` object,^ of a particular type are "represented" by an
`object of
`type TYPE. Suppose, for example, one
`wished to create a new kind of virtual resource. This
`would be done by creating an object (assume its name
`is X) of type TYPE. The object X now serves as a
`representative for all particular instances of resources
`of this new variety; in particular, objects of type X
`may now be created to represent the instances of the
`new resource.
`Operations are performed on objects by procedures."
`A procedure is an object of type PROCTYPE. The
`'right' to invoke a procedure on each particular object
`is limited by both the type of object and the user's
`access to it (see below).
`During execution of a procedure there exists a local
`name space, Ins, associated with it. The Ins is an object
`which provides a mapping between local object names
`(integers) accessible to the procedure and the actual
`global names for objects. Each Ins entry may also
`restrict the access rights (procedures that can be
`invoked to perform operations on or with the object)
`to a subset of those defined for that type of object.
`Thus the Ins provides both mapping and protection
`functions.
`
`*Here we wish to invoke the reader's intuitive notion of a
`'procedure' and its properties, e.g., a body of code, local storage,
`a parameter mechanism, etc.
`
`Figure &Example of LNS mapping and protection
`
`IPR2023-00033
`Apple EX1024 Page 8
`
`
`
`operations; no other operations are permitted on this
`type of object. The situation in Figure 6 might exist
`a t some instant. It shows (in the center) two pro-
`cedures, A and B, and their associated Ins's-directed
`arcs indicate the mapping function of the Ins and the
`letters along an arc indicate permitted accesses. Here,
`local name '1' of procedure A references a particular
`bibliography object, B1; UPDATE and PRINT access
`by A are permitted. The following information can be
`observed from the diagram:*
`
`-A.O -+ UPDATE
`-B.O -+ UPDATE
`-A.2 -+ PRINT
`(and so on; note that A cannot name the ERASE
`procedure nor bibliography object B2)
`-A may: update and print B1; only print B2
`-B may: only print B1; update, print, or erase B2;
`update and print B3.
`
`T H E RELIABILITY PROBLEM
`
`The existence in the physical system of multiple,
`redundant resources suggests the possibility of highly
`reliable operation-at
`least in the sense of continuing
`to provide (degraded) service when some fraction of
`the hardware is down. An explicit goal in the HYDRA
`design is to provide commensurate reliability in the
`software. Reliability may have two components:
`
`(1) Correctness: The major reason for unreliability
`in current software is that it is incorrect. How-
`ever,
`-the proposed design for the kernel is small
`enough that a "constructive programming"
`approach can be used effectively (Dijkstra)
`-the design suggests natural modular decompo-
`sition along the lines suggested by Parnas
`(Parnas 1972)
`-the coding is being done in a "systems imple-
`mentation language"
`(Bliss/ll)
`(Wulf, et
`al., 1970, 1971)
`-the
`protection mechanism itself absolutely
`guarantees that an erroneous or malicious
`program cannot destroy information to which
`it does not have legal access.
`Therefore the correctness of the kernel must
`be proven and its construction is proceeding in
`a highly stylized form design to facilitate this.
`
`* The notation X.n will be used to refer to the nth local name in
`procedure X; "+" is to be read "maps onto" or "is a reference to."
`
`C.mmp-A Multi-Mini-Processor
`
`773
`
`(2) Malfunction: Even if the software is correct i t
`is possible for the system to be unreliable, for
`example, as the result of misexecution of correct
`code by (perhaps intermittently) failing hard-
`ware. This problem is compounded by both the
`multiprocessor character of the system and the
`kernel design.
`
`Although a great deal of research has been done on
`hardware reliability, (for example in connection with
`computers for extended space mis