
A NOTE ON DISTRIBUTED COMPUTING

Ignoring the difference between the performance of local and remote invocations can lead to designs whose implementations are virtually assured of having performance problems, because the design requires a large amount of communication between components that are in different address spaces and on different machines. Ignoring the difference in the time it takes to make a remote object invocation and the time it takes to make a local object invocation is to ignore one of the major design areas of an application. A properly designed application will require determining, by understanding the application being designed, what objects can be made remote and what objects must be clustered together.
The vision outlined earlier, however, has an answer to this objection. The answer is two-pronged. The first prong is to rely on the steadily increasing speed of the underlying hardware to make the difference in latency irrelevant. This, it is often argued, is what has happened to efficiency concerns having to do with everything from high-level languages to virtual memory. Designing at the cutting edge has always required that the hardware catch up before the design is efficient enough for the real world. Arguments from efficiency seem to have gone out of style in software engineering, since in the past such concerns have always been answered by speed increases in the underlying hardware.
The second prong of the reply is to admit to the need for tools that will allow one to see what the pattern of communication is between the objects that make up an application. Once such tools are available, it will be a matter of tuning to bring objects that are in constant contact to the same address space, while moving those that are in relatively infrequent contact to wherever is most convenient. Since the vision allows all objects to communicate using the same underlying mechanism, such tuning will be possible by simply altering the implementation details (such as object location) of the relevant objects. However, it is important to get the application correct first, and after that one can worry about efficiency.
Whether or not it will ever become possible to mask the efficiency difference between a local object invocation and a distributed object invocation is not answerable a priori. Fully masking the distinction would require not only advances in the technology underlying remote object invocation, but would also require changes to the general programming model used by developers.

If the only difference between local and distributed object invocations was the difference in the amount of time it took to make the call, one could strive for a future in which the two kinds of calls would be conceptually indistinguishable. Whether the technology of distributed computing has moved far enough along to allow one to plan products based on such technology would be a matter of judgement, and rational people could disagree as to the wisdom of such an approach.

However, the difference in latency between the two kinds of calls is only the most obvious difference. Indeed, this difference is not really the fundamental difference between the two kinds of calls; even if it were possible to develop the technology of distributed calls to an extent that the difference in latency between the two sorts of calls was minimal, it would be unwise to construct a programming paradigm that treated the two calls as essentially similar. In fact, the difference in latency between local and remote calls, because it is so obvious, has been the only difference most see between the two, and has tended to mask the more irreconcilable differences.

A.4.2 Memory Access

A more fundamental (but still obvious) difference between local and remote computing concerns the access to memory in the two cases—specifically in the use of pointers. Simply put, pointers in a local address space are not valid in another (remote) address space. The system can paper over this difference, but for such an approach to be successful, the transparency must be complete. Two choices exist: either all memory access must be controlled by the underlying system, or the programmer must be aware of the different types of access—local and remote. There is no in-between.
If the desire is to completely unify the programming model—to make remote accesses behave as if they were in fact local—the underlying mechanism must totally control all memory access. Providing distributed shared memory is one way of completely relieving the programmer from worrying about remote memory access (or the difference between local and remote). Using the object-oriented paradigm to the fullest, and requiring the programmer to build an application with “objects all the way down” (that is, only object references or values are passed as method arguments), is another way to eliminate the boundary between local and remote computing. The layer underneath can exploit this approach by marshalling and unmarshalling method arguments and return values for inter-address space transmission.
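What such a marshalling layer does can be sketched in a few lines; here Python's pickle stands in for whatever wire format the underlying system uses, and the function names are hypothetical. Arguments are flattened to bytes in one address space and rebuilt as copies in another, which is why only object references or values, never raw pointers, can cross the boundary.

```python
import pickle

def marshal_call(method_name, *args):
    # Flatten the method name and arguments into bytes that can be
    # shipped to another address space.
    return pickle.dumps((method_name, args))

def unmarshal_call(wire_bytes):
    # Rebuild the call on the receiving side. The results are copies;
    # no pointer into the sender's address space survives the trip.
    return pickle.loads(wire_bytes)

record = {"account": 42, "amount": 10}
name, (restored,) = unmarshal_call(marshal_call("deposit", record))
assert name == "deposit"
assert restored == record        # equal in value...
assert restored is not record    # ...but a distinct object: identity is lost
```

The last assertion is the point: what arrives on the far side is equal by value, but any address-space-relative identity is gone.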
But adding a layer that allows the replacement of all pointers to objects with object references only permits the developer to adopt a unified model of object interaction. Such a unified model cannot be enforced unless one also removes the ability to get address-space-relative pointers from the language used by the developer. Such an approach erects a barrier to programmers who want to start writing distributed applications, in that it requires that those programmers learn a new style of programming which does not use address-space-relative pointers.
In requiring that programmers learn such a language, moreover, one gives up the complete transparency between local and distributed computing.⁴
Even if one were to provide a language that did not allow obtaining address-space-relative pointers to objects (or returned an object reference whenever such a pointer was requested), one would need to provide an equivalent way of making cross-address space references to entities other than objects. Most programmers use pointers as references for many different kinds of entities. These pointers must either be replaced with something that can be used in cross-address space calls, or the programmer will need to be aware of the difference between such calls (which will either not allow pointers to such entities, or do something special with those pointers) and local calls. Again, while this could be done, it does violate the doctrine of complete unity between local and remote calls. Because of memory access constraints, the two have to differ.
The danger lies in promoting the myth that “remote access and local access are exactly the same” and not enforcing the myth. An underlying mechanism that does not unify all memory accesses while still promoting this myth is both misleading and prone to error. Programmers buying into the myth may believe that they do not have to change the way they think about programming. The programmer is therefore quite likely to make the mistake of using a pointer in the wrong context, producing incorrect results. “Remote is just like local,” such programmers think, “so we have just one unified programming model.” Seemingly, programmers need not change their style of programming. In an incomplete implementation of the underlying mechanism, or one that allows an implementation language that in turn allows direct access to local memory, the system does not take care of all memory accesses, and errors are bound to occur. These errors occur because the programmer is not aware of the difference between local and remote access and what is actually happening “under the covers.”
The alternative is to explain the difference between local and remote access, making the programmer aware that remote address space access is very different from local access. Even if some of the pain is taken away by using an interface definition language like that specified in [1] and having it generate an intelligent language mapping for operation invocation on distributed objects, the programmer aware of the difference will not make the mistake of using pointers for cross-address space access. The programmer will know it is incorrect. By not masking the difference, the programmer is able to learn when to use one method of access and when to use the other.
Just as with latency, it is logically possible that the difference between local and remote memory access could be completely papered over and a single model of both presented to the programmer. When we turn to the problems introduced to distributed computing by partial failure and concurrency, however, it is not clear that such a unification is even conceptually possible.

A.5 Partial Failure and Concurrency

While unlikely, it is at least logically possible that the differences in latency and memory access between local computing and distributed computing could be masked. It is not clear that such a masking could be done in such a way that the local computing paradigm could be used to produce distributed applications, but it might still be possible to allow some new programming technique to be used for both activities. Such a masking does not even seem to be logically possible, however, in the case of partial failure and concurrency. These aspects appear to be different in kind in the case of distributed and local computing.²
Partial failure is a central reality of distributed computing. Both the local and the distributed world contain components that are subject to periodic failure. In the case of local computing, such failures are either total, affecting all of the entities that are working together in an application, or detectable by some central resource allocator (such as the operating system on the local machine).
This is not the case in distributed computing, where one component (machine, network link) can fail while the others continue. Not only is the failure of the distributed components independent, but there is no common agent that is able to determine what component has failed and inform the other components of that failure, no global state that can be examined that allows determination of exactly what error has occurred. In a distributed system, the failure of a network link is indistinguishable from the failure of a processor on the other side of that link.
These sorts of failures are not the same as mere exception raising or the inability to complete a task, which can occur in the case of local computing. This type of failure is caused when a machine crashes during the execution of an object invocation or a network link goes down, occurrences that cause the target object to simply disappear rather than return control to the caller. A central problem in distributed computing is ensuring that the state of the whole system is consistent after such a failure; this is a problem that simply does not occur in local computing.
The reality of partial failure has a profound effect on how one designs interfaces and on the semantics of the operations in an interface. Partial failure requires that programs deal with indeterminacy. When a local component fails, it is possible to know the state of the system that caused the failure and the state of the system after the failure. No such determination can be made in the case of a distributed system. Instead, the interfaces that are used for the communication must be designed in such a way that it is possible for the objects to react in a consistent way to possible partial failures.

² In fact, authors such as Schroeder and Hadzilacos and Toueg take partial failure and concurrency to be the defining problems of distributed computing.

Being robust in the face of partial failure requires some expression at the interface level. Merely improving the implementation of one component is not sufficient. The interfaces that connect the components must be able to state whenever possible the cause of failure, and there must be interfaces that allow reconstruction of a reasonable state when failure occurs and the cause cannot be determined.
If an object is co-resident in an address space with its caller, partial failure is not possible. A function may not complete normally, but it always completes. There is no indeterminism about how much of the computation completed. Partial completion can occur only as a result of circumstances that will cause the other components to fail.
The addition of partial failure as a possibility in the case of distributed computing does not mean that a single object model cannot be used for both distributed computing and local computing. The question is not “can you make remote method invocation look like local method invocation?” but rather “what is the price of making remote method invocation identical to local method invocation?” One of two paths must be chosen if one is going to have a unified model.
The first path is to treat all objects as if they were local and design all interfaces as if the objects calling them, and being called by them, were local. The result of choosing this path is that the resulting model, when used to produce distributed systems, is essentially indeterministic in the face of partial failure and consequently fragile and non-robust. This path essentially requires ignoring the extra failure modes of distributed computing. Since one can’t get rid of those failures, the price of adopting the model is to require that such failures are unhandled and catastrophic.
The other path is to design all interfaces as if they were remote. That is, the semantics and operations are all designed to be deterministic in the face of failure, both total and partial. However, this introduces unnecessary guarantees and semantics for objects that are never intended to be used remotely. Like the approach to memory access that attempts to require that all access is through system-defined references instead of pointers, this approach must also either rely on the discipline of the programmers using the system or change the implementation language so that all of the forms of distributed indeterminacy are forced to be dealt with on all object invocations.
This approach would also defeat the overall purpose of unifying the object models. The real reason for attempting such a unification is to make distributed computing more like local computing and thus make distributed computing easier. This second approach to unifying the models makes local computing as complex as distributed computing. Rather than encouraging the production of distributed applications, such a model will discourage its own adoption by making all object-based computing more difficult.
Similar arguments hold for concurrency. Distributed objects by their nature must handle concurrent method invocations. The same dichotomy applies if one insists on a unified programming model. Either all objects must bear the weight of concurrency semantics, or all objects must ignore the problem and hope for the best when distributed. Again, this is an interface issue and not solely an implementation issue, since dealing with concurrency can take place only by passing information from one object to another through the agency of the interface. So either the overall programming model must ignore significant modes of failure, resulting in a fragile system; or the overall programming model must assume a worst-case complexity model for all objects within a program, making the production of any program, distributed or not, more difficult.
One might argue that a multi-threaded application needs to deal with these same issues. However, there is a subtle difference. In a multi-threaded application, there is no real source of indeterminacy of invocations of operations. The application programmer has complete control over invocation order when desired. A distributed system by its nature introduces truly asynchronous operation invocations. Further, a non-distributed system, even when multi-threaded, is layered on top of a single operating system that can aid the communication between objects and can be used to determine and aid in synchronization and in the recovery of failure. A distributed system, on the other hand, has no single point of resource allocation, synchronization, or failure recovery, and thus is conceptually very different.

A.6 The Myth of “Quality of Service”

One could take the position that the way an object deals with latency, memory access, partial failure, and concurrency control is really an aspect of the implementation of that object, and is best described as part of the “quality of service” provided by that implementation. Different implementations of an interface may provide different levels of reliability, scalability, or performance. If one wants to build a more reliable system, one merely needs to choose more reliable implementations of the interfaces making up the system.
On the surface, this seems quite reasonable. If I want a more robust system, I go to my catalog of component vendors. I quiz them about their test methods. I see if they have ISO 9000 certification, and I buy my components from the one I trust the most. The components all comply with the defined interfaces, so I can plug them right in; my system is robust and reliable, and I’m happy.
Let us imagine that I build an application that uses the (mythical) queue interface to enqueue work for some component. My application dutifully enqueues records that represent work to be done. Another application dutifully dequeues them and performs the work. After a while, I notice that my application crashes due to time-outs. I find this extremely annoying, but realize that it’s my fault. My application just isn’t robust enough. It gives up too easily on a time-out. So I change my application to retry the operation until it succeeds. Now I’m happy. I almost never see a time-out. Unfortunately, I now have another problem. Some of the requests seem to get processed two, three, four, or more times. How can this be? The component I bought which implements the queue has allegedly been rigorously tested. It shouldn’t be doing this. I’m angry. I call the vendor and yell at him. After much finger-pointing and research, the culprit is found. The problem turns out to be the way I’m using the queue. Because of my handling of partial failures (which in my naivete, I had thought to be total), I have been enqueuing work requests multiple times.
Well, I yell at the vendor that it is still their fault. Their queue should be detecting the duplicate entry and removing it. I’m not going to continue using this software unless this is fixed. But, since the entities being enqueued are just values, there is no way to do duplicate elimination. The only way to fix this is to change the protocol to add request IDs. But since this is a standardized interface, there is no way to do this.
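The request-ID fix mentioned above can be sketched directly; the class and method names here are hypothetical, not part of any standardized queue interface. A client that retries after a time-out reuses its request ID, and the queue discards the duplicate:

```python
import uuid

class DedupQueue:
    """A queue whose protocol carries a client-generated request ID,
    so a retry after an ambiguous failure cannot enqueue twice."""

    def __init__(self):
        self._items = []
        self._seen = set()   # request IDs already accepted

    def enqueue(self, request_id, item):
        if request_id in self._seen:
            return           # duplicate of an earlier attempt: acknowledge, don't re-add
        self._seen.add(request_id)
        self._items.append(item)

    def dequeue(self):
        return self._items.pop(0)

    def __len__(self):
        return len(self._items)

q = DedupQueue()
rid = str(uuid.uuid4())      # the client generates the ID once per request
q.enqueue(rid, "work record")
q.enqueue(rid, "work record")   # retry after a time-out: silently ignored
assert len(q) == 1
```

The essential point of the tale survives the sketch: the ID has to travel in the protocol, so the fix is an interface change, not a better implementation behind the same interface.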
The moral of this tale is that robustness is not simply a function of the implementations of the interfaces that make up the system. While robustness of the individual components has some effect on the robustness of the overall system, it is not the sole factor determining system robustness. Many aspects of robustness can be reflected only at the protocol/interface level.
Similar situations can be found throughout the standard set of interfaces. Suppose I want to reliably remove a name from a context. I would be tempted to write code that looks like:

    while (true) {
        try {
            context->remove(name);
            break;
        } catch (NotFoundInContext) {
            break;
        } catch (NetworkServerFailure) {
            continue;
        }
    }
That is, I keep trying the operation until it succeeds (or until I crash). The problem is that my connection to the name server may have gone down, but another client’s may have stayed up. I may have, in fact, successfully removed the name but not discovered it because of a network disconnection. The other client then adds the same name, which I then remove. Unless the naming interface includes an operation to lock a naming context, there is no way that I can make this operation completely robust. Again, we see that robustness/reliability needs to be expressed at the interface level. In the design of any operation, the question has to be asked: what happens if the client chooses to repeat this operation with the exact same parameters as previously? What mechanisms are needed to ensure that they get the desired semantics? These are things that can be expressed only at the interface level. These are issues that can’t be answered by supplying a “more robust implementation” because the lack of robustness is inherent in the interface and not something that can be changed by altering the implementation.
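The race just described is easy to reproduce in miniature. This sketch uses hypothetical names rather than the naming interface of [1]; the point is only that the retry loop cannot tell a lost reply from a failed operation:

```python
class NotFoundInContext(Exception):
    pass

class NamingContext:
    """Stand-in for a name server's context."""

    def __init__(self):
        self._names = {}

    def bind(self, name, obj):
        self._names[name] = obj

    def remove(self, name):
        if name not in self._names:
            raise NotFoundInContext(name)
        del self._names[name]

ctx = NamingContext()
ctx.bind("printer", "obj-A")

ctx.remove("printer")          # the remove succeeds on the server...
# ...but the reply is lost, so this client sees only a network failure
ctx.bind("printer", "obj-B")   # meanwhile, another client re-binds the name
ctx.remove("printer")          # the first client's retry removes the NEW binding
assert "printer" not in ctx._names
```

The retry loop shown earlier cannot distinguish these two histories; only an interface-level mechanism, such as an operation to lock the naming context, could make the removal robust.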
Similar arguments can be made about performance. Suppose an interface describes an object which maintains sets of other objects. A defining property of sets is that there are no duplicates. Thus, the implementation of this object needs to do duplicate elimination. If the interfaces in the system do not provide a way of testing equality of reference, the objects in the set must be queried to determine equality. Thus, duplicate elimination can be done only by interacting with the objects in the set. It doesn’t matter how fast the objects in the set implement the equality operation. The overall performance of eliminating duplicates is going to be governed by the latency in communicating over the slowest communications link involved. There is no change in the set implementations that can overcome this. An interface design issue has put an upper bound on the performance of this operation.
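The bound can be made concrete with a small sketch (hypothetical classes, counting remote invocations rather than timing them). However fast equals() itself runs, inserting n objects forces on the order of n² remote calls, so total time is at least the number of calls multiplied by the latency of the slowest link:

```python
class RemoteObject:
    """Stand-in for an object reachable only by remote invocation."""
    calls = 0   # counts remote equality invocations across all objects

    def __init__(self, value):
        self.value = value

    def equals(self, other):
        RemoteObject.calls += 1   # each test is a network round trip
        return self.value == other.value

class RemoteSet:
    """A set that, lacking equality-of-reference in the interface,
    must ask each member whether it equals a candidate."""

    def __init__(self):
        self._members = []

    def insert(self, obj):
        # Every question crosses the network; nothing in this class
        # can be changed to avoid that.
        if not any(m.equals(obj) for m in self._members):
            self._members.append(obj)

s = RemoteSet()
for v in [1, 2, 3, 2, 1]:
    s.insert(RemoteObject(v))
# With per-call latency L, eliminating duplicates takes at least calls * L,
# no matter how cheap equals() is to compute on the remote side.
assert len(s._members) == 3
assert RemoteObject.calls == 6
```

Speeding up the set or the objects changes nothing; only an interface that offers a cheap reference-equality test removes the round trips.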

A.7 Lessons From NFS

We do not need to look far to see the consequences of ignoring the distinction between local and distributed computing at the interface level. NFS, Sun’s distributed computing file system, is an example of a non-distributed application programmer interface (API) (open, read, write, close, etc.) re-implemented in a distributed way.
Before NFS and other network file systems, an error status returned from one of these calls indicated something rare: a full disk, or a catastrophe such as a disk crash. Most failures simply crashed the application along with the file system. Further, these errors generally reflected a situation that was either catastrophic for the program receiving the error or one that the user running the program could do something about.
NFS opened the door to partial failure within a file system. It has essentially two modes for dealing with an inaccessible file server: soft mounting and hard mounting. But since the designers of NFS were unwilling (for easily understandable reasons) to change the interface to the file system to reflect the new, distributed nature of file access, neither option is particularly robust.
Soft mounts expose network or server failure to the client program. Read and write operations return a failure status much more often than in the single-system case, and programs written with no allowance for these failures can easily corrupt the files used by the program. In the early days of NFS, system administrators tried to tune various parameters (time-out length, number of retries) to avoid these problems. These efforts failed. Today, soft mounts are seldom used, and when they are used, their use is generally restricted to read-only file systems or special applications.
Hard mounts mean that the application hangs until the server comes back up. This generally prevents a client program from seeing partial failure, but it leads to a malady familiar to users of workstation networks: one server crashes, and many workstations—even those apparently having nothing to do with that server—freeze. Figuring out the chain of causality is very difficult, and even when the cause of the failure can be determined, the individual user can rarely do anything about it but wait. This kind of brittleness can be reduced only with strong policies and network administration aimed at reducing interdependencies. Nonetheless, hard mounts are now almost universal.
Note that because the NFS protocol is stateless, it assumes clients contain no state of interest with respect to the protocol; in other words, the server doesn’t care what happens to the client. NFS is also a “pure” client-server protocol, which means that failure can be limited to three parties: the client, the server, or the network. This combination of features means that failure modes are simpler than in the more general case of peer-to-peer distributed object-oriented applications, where no such limitation on shared state can be made and where servers are themselves clients of other servers. Such peer-to-peer distributed applications can and will fail in far more intricate ways than are currently possible with NFS.
The limitations on the reliability and robustness of NFS have nothing to do with the implementation of the parts of that system. There is no “quality of service” that can be improved to eliminate the need for hard mounting NFS volumes. The problem can be traced to the interface upon which NFS is built, an interface that was designed for non-distributed computing where partial failure was not possible. The reliability of NFS cannot be changed without a change to that interface, a change that will reflect the distributed nature of the application.

This is not to say that NFS has not been successful. In fact, NFS is arguably the most successful distributed application that has been produced. But the limitations on its robustness have set a limitation on the scalability of NFS. Because of the intrinsic unreliability of the NFS protocol, use of NFS is limited to fairly small numbers of machines, geographically co-located and centrally administered. The way NFS has dealt with partial failure has been to informally require a centralized resource manager (a system administrator) who can detect system failure, initiate resource reclamation, and insure system consistency. But by introducing this central resource manager, one could argue that NFS is no longer a genuinely distributed application.

A.8 Taking the Difference Seriously

Differences in latency, memory access, partial failure, and concurrency make merging of the computational models of local and distributed computing both unwise to attempt and unable to succeed. Merging the models by making local computing follow the model of distributed computing would require major changes in implementation languages (or in how those languages are used) and make local computing far more complex than is otherwise necessary. Merging the models by attempting to make distributed computing follow the model of local computing requires ignoring the different failure modes and basic indeterminacy inherent in distributed computing, leading to systems that are unreliable and incapable of scaling beyond small groups of machines that are geographically co-located and centrally administered.
A better approach is to accept that there are irreconcilable differences between local and distributed computing, and to be conscious of those differences at all stages of the design and implementation of distributed applications. Rather than trying to merge local and remote objects, engineers need to be constantly reminded of the differences between the two, and know when it is appropriate to use each kind of object.
Accepting the fundamental difference between local and remote objects does not mean that either sort of object will require its interface to be defined differently. An interface definition language such as IDL [1] can still be used to specify the set of interfaces that define objects. However, an additional part of the definition of a class of objects will be the specification of whether those objects are meant to be used locally or remotely. This decision will need to consider what the anticipated message frequency is for the object, and whether clients of the object can accept the indeterminacy implied by remote access. The decision will be reflected in the interface to the object indirectly, in that the interface for objects that are meant to be accessed remotely will contain operations that allow reliability in the face of partial failure.
It is entirely possible that a given object will often need to be accessed by some objects in ways that cannot allow indeterminacy, and by other objects relatively rarely and in a way that does allow indeterminacy. Such cases should be split into two objects (which might share an implementation) with one having an interface that is best for local access and the other having an interface that is best for remote access.
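One way such a split might look (a sketch under assumed names, not a prescription from the text): a single implementation exposed through a local interface with plain operations, and a remote interface whose operations carry request IDs so that retries after a partial failure are applied at most once:

```python
class CounterImpl:
    """The shared implementation behind both interfaces."""

    def __init__(self):
        self.count = 0
        self._applied = set()   # request IDs already applied (remote use only)

    def increment(self):
        self.count += 1
        return self.count

class LocalCounter:
    """Local interface: a call either completes, or the caller has failed too,
    so no retry machinery is needed."""

    def __init__(self, impl):
        self._impl = impl

    def increment(self):
        return self._impl.increment()

class RemoteCounter:
    """Remote interface: callers supply a request ID so that a retry after
    an ambiguous failure cannot double-count."""

    def __init__(self, impl):
        self._impl = impl

    def increment(self, request_id):
        if request_id not in self._impl._applied:
            self._impl._applied.add(request_id)
            self._impl.increment()
        return self._impl.count

impl = CounterImpl()
local, remote = LocalCounter(impl), RemoteCounter(impl)
local.increment()
remote.increment("req-1")
remote.increment("req-1")   # retry after a suspected failure: no double count
assert impl.count == 2
```

The two interfaces share an implementation, but only the remote one pays the cost of indeterminacy, which is exactly the division of labor the text proposes.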
A compiler for the interface definition language used to specify classes of objects will need to alter its output based on whether the class definition being compiled is for a class to be used locally or a class to be used remotely. For interfaces meant for distributed objects, the code produced might be very much like that generated by RPC stub compilers today. Code for a local interface, however, could be much simpler, probably requiring little more than a class definition in the target language.
While writing code, engineers will have to know whether they are sending messages to local or remote objects, and access those objects differently. While this might seem to add to the programming difficulty, it will in fact aid the programmer by providing a framework under which he or she can learn what to expect from the different kinds of calls. To program completely in the local environment, according to this model, will not require any changes from the programmer’s point of view. The discipline of defining classes of objects using an interface definition language will insure the desired separation of interface from implementation, but the actual process of implementing an interface will be no different than what is done today in an object-oriented language.
Programming a distributed application will require the use of different techniques than those used for non-distributed applications. Programming a distributed application will require thinking about the problem in a different way than it was thought about when the solution was a non-distributed application. But that is only to be expected. Distributed objects are different from local objects, and keeping that difference visible will keep the programmer from forgetting the difference and making mistakes. Knowing that an object is outside of the local address space, and perhaps on a different machine, will remind the programmer that he or she needs to program in a way that reflects the kinds of failures, indeterminacy, and concurrency constraints inherent in the use of such objects. Making the difference visible will aid in making the difference part of the design of the system.
Accepting that local and distributed computing are different in an irreconcilable way will also allow an organization to allocate its research and engineering resources more wisely. Rather than using those resources in attempts to paper over the differences between the two kinds of computing, resources can be directed at improving the performance and reliability of each.
One consequence of the view espoused here is that it is a mistake to attempt to construct a system that is “objects all the way down” if one understands the goal as a distributed system constructed of the same kind of objects all the way down. There will be a line where the object model changes; on one side of the line will be distributed objects, and on the other side of the line there will (perhaps) be local objects. On either side of the line, entities on the other side of the line will be opaque; thus one distributed object will not know (or care) if the implementation of another distributed object with which it communicates is made up of objects or is implemented in some other way. Objects on different sides of the line will differ in kind and not just in degree; in particular, the objects will differ in the kinds of failure modes with which they must deal.

A.9 A Middle Ground

As noted in Section A.2, the distinction between local and dist