Poseidon House
`Castle Park
`Cambridge CB3 0RD
`United Kingdom
`
TELEPHONE: Cambridge (01223) 515010
INTERNATIONAL: +44 1223 515010
FAX: +44 1223 359779
E-MAIL: apm@ansa.co.uk
`
`ANSA Phase III
`
`Monitoring in Distributed Systems
`
`Yigal Hoffner
`
`Abstract
`
`A general model of management is introduced and used in the development of a model of the
`management of monitoring for object-based federated distributed systems. This model is
`subsequently used to show how monitoring and its management can be implemented in such
`systems.
`
`The information and structures necessary for conducting a monitoring session with multiple
`objects are presented. The problem of managing a monitoring session, where the set of objects
`under observation changes dynamically, is addressed. Finally, the problem of management
`across federation boundaries is discussed.
`
`APM.1008.01
`
`Approved
`Architecture Report
`
`25th October 1994
`
`Distribution:
`Supersedes:
`Superseded by:
`
Copyright © 1994 Architecture Projects Management Limited
`The copyright is held on behalf of the sponsors for the time being of the ANSA Workprogramme.
`
The material in this Report has been developed as part of the ANSA
Architecture for Open Distributed Systems. ANSA is a collaborative initiative,
managed by Architecture Projects Management Limited on behalf of the companies
sponsoring the ANSA Workprogramme.

The ANSA initiative is open to all companies and organisations. Further
information on the ANSA Workprogramme, the material in this report, and on
other reports can be obtained from the address below.

The authors acknowledge the help and assistance of their colleagues, in
sponsoring companies and the ANSA team in Cambridge, in the preparation of
this report.
`
`Architecture Projects Management Limited
`
`Poseidon House
`Castle Park
`CAMBRIDGE
`CB3 0RD
`United Kingdom
`
TELEPHONE UK: (01223) 515010
INTERNATIONAL: +44 1223 515010
FAX: +44 1223 359779
E-MAIL: apm@ansa.co.uk
`
Copyright © 1994 Architecture Projects Management Limited
The copyright is held on behalf of the sponsors for the time being of the ANSA
Workprogramme.

Architecture Projects Management Limited takes no responsibility for the
consequences of errors or omissions in this Report, nor for any damages
resulting from the application of the ideas expressed herein.
`
Contents

1 Introduction
1.1 Abstract
1.2 Audience, scope and purpose
1.3 Context
1.4 Overview

2 Monitoring
2.1 The purpose of monitoring
2.2 Modelling and the level of monitoring
2.3 The problems of monitoring
2.3.1 Direct and indirect observations
2.3.2 Complete and incomplete observations
2.3.3 Presentation problems
2.3.4 Monitoring and interference

3 Distribution and monitoring
3.1 Introduction
3.2 Aspects of distribution
3.2.1 Physical separation
3.2.2 Concurrency
3.2.3 Heterogeneity
3.2.4 Federation
3.2.5 Scaling
3.2.6 Evolution
3.3 Problems and reversed assumptions
3.4 Conclusions about monitoring in a distributed system

4 Approach to monitoring in distributed systems
4.1 Introduction
4.2 Monitoring and its management in object-based federated distributed systems

5 A model of monitoring and its management
5.1 Introduction
5.2 The generic model of management
5.3 Applying the model of management to monitoring
5.4 Developing the model of management of monitoring
5.4.1 The generic model

6 The management of monitoring
6.1 Introduction
6.2 The monitoring process
6.3 Areas of management
6.4 Management of generation of monitoring events
6.5 Management of distribution, collation and logging
6.6 Management of processing
6.7 Management of processing and presentation processes
6.8 Management facilities
6.9 Monitoring, management and system development epochs

7 Monitoring and management in objects
7.1 Introduction
7.2 Model of monitoring facilities in an object
7.3 Monitoring and management facilities
7.3.1 Object query operations
7.3.2 Object activation operations
7.4 Model of monitoring facilities in a capsule
7.5 Monitoring and management facilities
7.5.1 Capsule query operations
7.5.2 Capsule activation operations
7.6 The granularity of monitoring

8 Managing a monitoring session
8.1 Introduction
8.2 Monitoring a configuration of multiple objects
8.2.1 Managing a dynamic monitoring session
8.2.2 Scope of monitoring
8.2.3 Views and information in a monitoring session
8.3 Assumptions about the monitoring session
8.4 Monitoring session management facilities
8.5 Managing a monitoring session
8.5.1 The phases of a monitoring session
8.6 Setting up the monitoring session components
8.7 Setting up of the initial scope of monitoring
8.8 Managing an ongoing monitoring session
8.8.1 Extending the scope of Monitoring
8.8.2 Notifying the MMgr

9 Monitoring across boundaries
9.1 Introduction
9.1.1 Framework for standardization
9.1.2 Management domains and boundaries
9.2 Integrating application and distributed infrastructure monitoring
9.3 Integrating monitoring in a single domain
9.4 Integrating monitoring across several domains
9.5 Monitoring facilities standardization issues
9.5.1 Basic set of events - a taxonomy
9.5.2 Management of monitoring in objects
9.5.3 Access to management and scope of monitoring
9.5.4 The Monitoring manager and Monitoring collator
9.5.5 Presentation issues

`1 Introduction
`
`1.1 Abstract
`
`A general model of management is introduced and used in the development of
`a model of the management of monitoring for object-based federated
`distributed systems. This model is subsequently used to show how monitoring
`and its management can be implemented in such systems.
`The information and structures necessary for conducting a monitoring session
`with multiple objects are presented. The problem of managing a monitoring
`session, where the set of objects under observation changes dynamically, is
`addressed. Finally, the problem of management across federation boundaries
`is discussed.
`
`1.2 Audience, scope and purpose
`
`This document develops a model of monitoring and its management for object-
`based federated distributed systems. It is addressed to designers of distributed
`systems.
`The approach described in this document requires that the underlying
`distributed systems infrastructure can represent managed entities as
`encapsulated objects and can transmit references to object interfaces as
`parameters of management operations. Such a capability is the basis of the
`ISO Basic Reference Model for Open Distributed Processing [X.900 92], the
`OMG Combined Object Request Broker Specification [OMG 91] and the ANSA
`architecture [AR.001 93].
`
`1.3 Context
`
This document should be read in conjunction with [TR.39 93] and [TR.41 93]. The
documents are related to each other as follows:
• TR.39 explains the philosophy and general approach to management in
object-based federated distributed systems
• TR.41 explains the problems of visualizing distributed systems and
discusses the requirements which visualization places on the monitoring
infrastructure
• this document explains and develops a model of monitoring and its
management. It uses the model presented in TR.39 in order to construct a
model of the management of monitoring. It also uses the requirements which
the visualization of distributed systems imposes on the monitoring
infrastructure, and which are outlined in TR.41.
`
`1.4 Overview
`
`Monitoring is the process of obtaining, collecting, and presenting the
`information required by an observer about the observed system [JOYCE 87],
`[DOMAINS 92], [MCDOWELL 89], [SAMANI 92], [SLOMAN 89a], [SLOMAN 89b],
`[LABARRE 91] and [WINTERBOTHAM 87].
Monitoring is always carried out with a purpose in mind. The general aim is to
obtain information in order to construct a model of system behaviour or to
modify an existing model. The general activity of monitoring a system can be
specialized to a particular purpose such as accounting, debugging or testing,
among others. The purpose for which monitoring is specialized determines the
type of information collected and the way in which it is collected.
Distribution, and more specifically the need to deal with issues such as
heterogeneity, autonomy, physical separation and concurrency, complicates the
process of monitoring. The design and development of monitoring facilities
need to deal with these problems.
`The object-based approach to building distributed systems requires the
`designers to incorporate monitoring and management facilities in each object.
`This paper is an investigation of the facilities which should be included in each
`object if monitoring is to be viable in distributed systems.
`Monitoring often involves correlating concurrent events at multiple objects.
`Management structures are, therefore, necessary to maintain information on
`the objects participating in a monitoring session, and manage the monitoring
`facilities in each of them. In addition, appropriate structures for collecting
`monitoring information are required.
`In a distributed system, a monitoring session can evolve dynamically, as
`activities related to an application spread throughout the system. The
`management of a monitoring session must therefore be able to extend and
`contract the set of objects under observation. The distributed system
`infrastructure must allow access to management functions of an object given a
`reference to one of the object’s service interfaces.
`When monitoring across federation boundaries, differences in monitoring and
`management facilities must be accommodated either by prior agreement to
`provide common facilities, or by supplying the appropriate translators which
`allow interworking. There is also a need to provide channels and mechanisms
`for resolving policy conflicts across federation boundaries. The integration of
`monitoring facilities of different systems is an important part of the problem of
`monitoring across boundaries.
`
`2 Monitoring
`
2.1 The purpose of monitoring
`
`Monitoring is carried out in order to obtain information about a system, and in
`general, monitoring is part of the process of management (Figure 2.1). Among
`the many activities which involve monitoring we find:
• debugging
• testing
• accounting
• performance evaluation
• resource utilisation analysis
• security
• fault detection
• teaching aid.
`Monitoring and its management are concerned with providing the
`necessary information in order to allow the construction of the required model
`of the observed system and its presentation. It is the purpose of monitoring
`which dictates what should be observed and also how the information is to be
`obtained.
`
Figure 2.1: The relationship between management and monitoring
[Figure not reproduced. Labelled elements: System, Controlling, Monitoring, Decision making.]
`
`2.2 Modelling and the level of monitoring
`
The different purposes for which monitoring is carried out can be pursued at
different levels. Thus, for example, debugging a single object as opposed to
debugging the interactions among multiple objects will require different
events to be observed. A language debugger will require events to be
generated at a finer granularity than a tool aimed at debugging the
interactions between objects.
Some of the purposes listed in §2.1 will require a different model or models
of the distributed system. The exact level of modelling will dictate the
granularity of the events the observer wishes to monitor.
`
2.3 The problems of monitoring
`
`The following is an exposition of the problems encountered when monitoring
`centralized and distributed computer systems.
`
`2.3.1 Direct and indirect observations
The behaviour of some systems can be directly observed, thereby making the
process of monitoring relatively straightforward. In computer systems most
events of interest cannot be observed directly without special facilities,
thereby requiring the incorporation of a monitoring infrastructure in such
systems. The monitoring infrastructure will also facilitate the management of
monitoring.
There may be several levels of indirection between the observed system and
the observer. Indirection may:
• make changes to the observations which are not related to the behaviour
of the observed system, for example, change the order of the messages
sent to the observer
• affect the reliability of the observations
• introduce distance between observer and observed system and hence lack
of trust
• directly influence the behaviour of the observed system (interference).
Distribution complicates the process of monitoring, introducing additional
levels of indirection and consequently additional problems. These are
discussed in Chapter 3.
`
`2.3.2 Complete and incomplete observations
`Completeness and incompleteness refer to whether the information necessary
`in order to construct a particular model of an observed system is available or
`not.
`It could be argued that any observation of a system only reveals part of the
`system. This is not a problem when the observer is constructing a particular
`model of the system and the observation fits this model. However,
`incompleteness can cause problems when it is not intended or not catered for
`[TR.41 93]. For example:
`• when information cannot be obtained thereby hiding certain aspects of the
`system from the observer
`• when hidden information makes some of the available information non-
`interpretable by creating the wrong context for its interpretation.
`In reality, the two problems may stem from the same cause.
`
`2.3.3 Presentation problems
In many cases it is necessary to modify the information from the observed
events in a system in order to overcome the following problems (Figure 2.2):
• observed events appear in a form which is not amenable to immediate
use by the observer
• observed events occur at a rate which cannot be easily used by the
observer
• the volume of observed events may be such that it overwhelms the
observer
• in a system which has no central point of observation, events of interest
may occur at different parts of the system. Structures and processes which
collect and order the information from the observed events are therefore
necessary.
`
`2.3.4 Monitoring and interference
`Every system is affected by being monitored. The extent of the influence may
`or may not be negligible from the point of view of the user(s) or the observer(s)
`of the system. There is a relation between the flexibility of the monitoring
`facilities, the cost of implementation, and the extent to which they interfere
`with the behaviour of the system.
The most general requirement of monitoring, which is independent of the
`purpose for which it is introduced, is that although the sequence of system
`events may change as a result of the interference caused by monitoring, it
`must not result in an illegal sequence of events taking place.
`
Figure 2.2: Monitoring: transforming the information from the system events
[Figure not reproduced. Labelled elements: System, System Event, Monitor, Monitoring Event, Observer.]
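
The transformation pictured in Figure 2.2 can be sketched in code. The following is a minimal illustration and not part of the ANSA architecture: the names `SystemEvent`, `MonitoringEvent` and `Monitor`, and the choice of filtering by event kind and delivering in batches, are assumptions made for the example. It shows a monitor that converts raw system events into a form the observer can use, discards kinds the observer did not ask for, and hands events over in batches so that volume and rate do not overwhelm the observer.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass(frozen=True)
class SystemEvent:
    """Raw event as produced inside the observed system (hypothetical shape)."""
    source: str      # name of the object that produced the event
    kind: str        # e.g. "invoke", "reply", "create"
    payload: bytes   # uninterpreted data


@dataclass(frozen=True)
class MonitoringEvent:
    """Event in a form the observer can use directly."""
    source: str
    kind: str
    summary: str


@dataclass
class Monitor:
    """Sits between the system and the observer, as in Figure 2.2."""
    deliver: Callable[[List[MonitoringEvent]], None]
    wanted_kinds: Set[str]
    batch_size: int = 3
    _pending: List[MonitoringEvent] = field(default_factory=list)

    def on_system_event(self, ev: SystemEvent) -> None:
        if ev.kind not in self.wanted_kinds:
            return  # reduce volume: drop events of no interest
        # Change of form: raw bytes become a readable summary.
        summary = f"{ev.source} {ev.kind} ({len(ev.payload)} bytes)"
        self._pending.append(MonitoringEvent(ev.source, ev.kind, summary))
        if len(self._pending) >= self.batch_size:
            self.flush()

    def flush(self) -> None:
        """Hand the accumulated batch to the observer (rate reduction)."""
        if self._pending:
            batch, self._pending = self._pending, []
            self.deliver(batch)
```

An observer would supply its own `deliver` callable and call `flush()` at the end of a session to receive any events still pending.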
`
`3 Distribution and monitoring
`
3.1 Introduction
`
`This chapter discusses aspects of distribution which affect monitoring:
`physical separation, concurrency, heterogeneity, federation, scaling and
`evolution. A summary of the assumptions which are no longer valid when
`monitoring distributed systems, as opposed to centralised systems, is
`presented. Some general comments are made about the philosophy of
`management in an object-oriented distributed system. The chapter also
`identifies the three major problem areas of providing monitoring in distributed
`systems: management of monitoring, reconstruction of the causal flow of
`events, and presentation of monitoring information.
`
`3.2 Aspects of distribution
`
`The following sections discuss the aspects of distribution which have an effect
`on monitoring: physical separation, concurrency, heterogeneity, federation,
`scaling and evolution [WARNE 91].
`
`3.2.1 Physical separation
`In a distributed system the physical separation of objects is unavoidable. In
`addition, communication delays among objects are usually variable and
`unpredictable. As a result there is no single point of reference from which
`events in the entire system can be directly observed. In order to obtain a global
`view of the system it is necessary to collect information on local events from
`several locations, from which a reconstruction of the flow of global events can
be made. Such a reconstruction is needed, for example, to determine whether a
certain event at one location is causally related to another event at some
other location.
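
One well-known way to decide whether two events at different locations are causally related is to stamp each event with a vector clock. This report does not prescribe that mechanism; the sketch below is an illustrative assumption, and the names `VectorClock`, `tick`, `merge`, `happened_before` and `concurrent` are hypothetical.

```python
from typing import Dict


class VectorClock:
    """Per-location logical clock; one counter per known location."""

    def __init__(self, location: str) -> None:
        self.location = location
        self.counters: Dict[str, int] = {location: 0}

    def tick(self) -> Dict[str, int]:
        """Record a local event and return a copy of its timestamp."""
        self.counters[self.location] += 1
        return dict(self.counters)

    def merge(self, received: Dict[str, int]) -> Dict[str, int]:
        """Record receipt of a message carrying the sender's timestamp."""
        for loc, count in received.items():
            self.counters[loc] = max(self.counters.get(loc, 0), count)
        return self.tick()


def happened_before(a: Dict[str, int], b: Dict[str, int]) -> bool:
    """True if the event stamped `a` can have causally influenced `b`."""
    keys = set(a) | set(b)
    no_greater = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    strictly_less = any(a.get(k, 0) < b.get(k, 0) for k in keys)
    return no_greater and strictly_less


def concurrent(a: Dict[str, int], b: Dict[str, int]) -> bool:
    """True if neither event can have influenced the other."""
    return a != b and not happened_before(a, b) and not happened_before(b, a)
```

With such timestamps attached to monitoring events, a collector can compare events from different locations without relying on the order in which the reports arrive.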
`There are situations in which it is not possible to monitor events in certain
`parts of the system. This may be the result of the absence of monitoring
`facilities, or policy decisions imposed on an object. There are two additional
`complications in distributed systems:
• failures can occur during communication
• services may partially fail.
`Such failures may affect not only the activities being monitored, but also the
`monitoring of these activities, resulting in incomplete information.
`This complicates the reconstruction of the flow of events in the system, and
`results in an incomplete picture of the system. This problem is addressed in
`more detail in [TR.41 93].
Distributed systems are characterized by the possibility of partial failures.
Partial failures may lead to situations where some but not all of the managed
objects in a system can be accessed. Moreover, some of the management
infrastructure itself may fail. Fault-tolerant techniques may therefore have to
be applied to the management facilities themselves in order to make them
more resilient to failure.
The physical separation of systems, together with the variable communication
delays, also means that there is no single point of control in a distributed
system. This, together with the absence of a single point of observation, means
that checkpoints, tracing, breakpoints and single stepping of a distributed
application are difficult, if not impossible, without changing the nature of
the system.
Figure 2.1 in Chapter 2 shows the relationship between a system, its monitor
and controller, and the decision making process. If the system is distributed,
the absence of a single point of control and of a single point of observation
implies that the monitor and controller must be distributed as well.
Furthermore, in some systems the decision making process may be distributed,
may have to be carried out in the face of incomplete information, or both.
`
`3.2.2 Concurrency
`Distributed systems will support multiple objects and activities. Bindings
`between objects will be set up and discarded, and objects will be able to invoke
`other objects asynchronously through these bindings. Furthermore, objects
`will be created and destroyed as the need arises. The dynamic initiation and
`termination of activities will lead to situations where the activities stemming
`from an application may spread throughout the system. The extent of the
`initiated activities may not be known in advance.
`From the point of view of monitoring this creates several difficulties. In order
`to gain sufficient understanding of the flow of events in a system it may not be
`enough to monitor a single object or simply its interactions with other objects.
`In fact we may wish to gain information on how activities spread in a system.
Thus we may wish to:
• fully activate monitoring of objects with which a monitored object
interacts
• follow the chain of activity as it moves from one object to another.
`As different combinations of these strategies may occasionally be required, the
`management of the monitoring activities in such circumstances will be
`difficult unless extremely flexible management structures can be provided.
`Together with different monitoring activation strategies, additional event
`information to allow the observer to follow activities throughout the system is
`necessary.
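
The second strategy, following the chain of activity, can be illustrated with a small sketch. It assumes, purely for illustration, that each monitored invocation carries an activity identifier and that an object switches on its own monitoring when such an invocation arrives. `MonitoredObject`, `invoke` and `activity_id` are hypothetical names, not ANSA interfaces, and the object graph is assumed to contain no cycles.

```python
from typing import List, Optional, Tuple

# (activity id, object name, operation): the extra event information
# that lets an observer follow one activity through the system.
Trace = List[Tuple[str, str, str]]


class MonitoredObject:
    def __init__(self, name: str, trace: Trace) -> None:
        self.name = name
        self.trace = trace
        self.monitoring_active = False
        self.peers: List["MonitoredObject"] = []

    def invoke(self, operation: str, activity_id: Optional[str] = None) -> None:
        """Handle an invocation; a non-None activity_id marks it as monitored."""
        if activity_id is not None:
            # Monitoring follows the activity: activate on arrival.
            self.monitoring_active = True
            self.trace.append((activity_id, self.name, operation))
        for peer in self.peers:
            # The identifier travels with every onward invocation
            # (assumes the peers do not form a cycle).
            peer.invoke(operation, activity_id)
```

Objects that the activity never reaches stay unmonitored, so the scope of monitoring grows only as far as the activity itself spreads.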
`
`3.2.3 Heterogeneity
`Large scale distributed systems inevitably include some diversity in their
`hardware, operating systems and their distributed system infrastructure. It is
`reasonable to assume, therefore, that this diversity will be reflected in the
`implementations of monitoring facilities. Distribution does not only refer to
`the run-time physical separation of components, but also to the possibility of a
`distributed development environment. In such a case it is possible to have
different implementations of monitoring which do not conform to one another [1].

[1] This may happen between heterogeneous systems but may also happen in a
homogeneous environment.
`
In order to make monitoring across heterogeneous systems possible, it is
necessary to reach agreement on monitoring conformance issues. These are
discussed in Chapter 9.
Standard management facilities cannot be assumed across domain
boundaries. Different monitoring and control facilities may exist in different
management domains.
The problem of the integration of management infrastructures where different
monitoring and control facilities may exist can be overcome through
agreement on facilities or by the incorporation of facilities which allow
dynamic integration of the different local management facilities.

3.2.4 Federation
`The existence of centralised ownership and universal and technical control in
`large scale distributed systems cannot be assumed, and separate sources of
`authority will inevitably reside side by side. In such systems a “federated”
`style of interworking will be necessary in which no participant is in control of
`the others. Each system controls its own services locally according to its
`policies. Different monitoring policies must be anticipated within federated
`systems and problems will arise when attempting to monitor across federation
`boundaries between systems whose monitoring policies clash. Cooperation
`between systems requires the parties responsible for them to negotiate the use
`of services either prior to the request for use of service or as a result of such a
`request.
`Examples of possible areas where negotiation is needed are:
`• where the authority allowed to request monitoring may differ
`• where the collation and logging strategies may be different causing, for
`example, security compromise or unacceptable resource usage
`• where granting access to monitoring management may be related in
`different systems to different conditions, e.g. system load, number of
`users, time of day, etc.
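
A sketch of how a domain might evaluate an incoming monitoring request against its local policy is given below. The policy fields follow the negotiation areas listed above (who may request monitoring, where collated information may be logged, and conditions such as load or time of day). The class and field names are hypothetical and the rules are illustrative only.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class MonitoringRequest:
    requester: str          # authority asking to monitor
    log_destination: str    # domain where collated events would be stored
    hour: int               # local hour of day at the target, 0-23


@dataclass(frozen=True)
class DomainPolicy:
    allowed_requesters: FrozenSet[str]
    allowed_log_domains: FrozenSet[str]
    max_load: float = 0.8                  # refuse monitoring above this load
    permitted_hours: range = range(0, 24)  # hours in which monitoring is allowed

    def objections(self, req: MonitoringRequest, load: float) -> List[str]:
        """Return the reasons for refusal; an empty list means 'granted'."""
        reasons = []
        if req.requester not in self.allowed_requesters:
            reasons.append("requester not authorised")
        if req.log_destination not in self.allowed_log_domains:
            reasons.append("logging destination not acceptable")
        if load > self.max_load:
            reasons.append("system load too high")
        if req.hour not in self.permitted_hours:
            reasons.append("outside permitted hours")
        return reasons
```

Returning the list of objections, rather than a bare yes or no, gives the requesting side something to negotiate over.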
`
`3.2.5 Scaling
`As discussed in the section on concurrency, activities in a distributed system
`can spread and encompass large parts of the system. In cases where
`monitoring is expected to report on such activities, it is important to note that
`the monitoring activity itself will have to spread, thus consuming increasing
`storage, processing and communication resources. It is therefore essential that
`(the distribution of) monitoring itself scales well.
Scaling requires monitoring structures which can
`accommodate distribution, system evolution, and growth of the activity in the
`face of resource constraints and performance requirements. Both management
`and collation structures must be designed with scaling in mind; these issues
`are dealt with in Chapter 7 and Chapter 8.
`
`3.2.6 Evolution
`Distributed systems will evolve over time, possibly in an inconsistent manner.
`If monitoring procedures change over time, there may be clashes between
`monitoring standards embedded in new components and monitoring
standards in existing components. The problems arising in evolving systems
are often similar in nature to those arising in heterogeneous and federated
systems.
`
`3.3 Problems and reversed assumptions
`
`Certain problems associated with monitoring in a centralized system are
`exacerbated when dealing with distributed systems. However, problems also
`arise because of the reversal of many of the implicit assumptions made when
`monitoring centralized systems:
`• no central point of control: not being able to directly control the entire
`system from any single point requires extensions to sequential techniques
`involving monitoring, which in a centralized system are based on the
`existence of a single thread of control. Examples of such techniques are
`break-points, single stepping and checkpoints
`• no central point of observation: not being able to directly observe the
`system in its entirety from a single point of observation requires the
`collection of locally observed events in order to construct global views.
`However, this is more complicated than simply collecting the monitoring
`information. This is due to the fact that coupled with non-deterministic
`communication delays, collation of monitoring information cannot be
`based on the assumption that the order in which monitoring information
`is collected is related to the order in which they occurred. Furthermore,
`events of interest may consist of sequences of events which occur at
`different points of observation, thus requiring sequence recognition
`facilities
`• no central source of monitoring information: a frequent implicit
`assumption in a centralized systems is that the source of the monitoring
`information is a single source, and that error and monitoring messages
`will be sent to directly to the user's terminal or to a local file. Neither of
`these assumptions holds in a distributed system. Collation strategies are
`necessary to cater for multiple sources and destinations
`• no central point of decision making: the process of making decisions
`in a distributed system may itself be distributed. This may also be the
`case with the management of monitoring resulting in more than one
`manager in a monitoring session or an object participating in more than
`one monitoring session
`incomplete observability: in some cases it is not possible to observe
`certain parts of the system at all or only partially, resulting in incomplete
`information about events in that part of the system
`• non-determinism: distributed, asynchronous systems are inherently
`non-deterministic. Thus, two executions of the same program may
`produce different, but nevertheless valid, ordering of events. This makes
`the reproduction of errors and the creation of certain test conditions
`difficult, if not impossible, at times (monitoring information can be used to
`reproduce test conditions if monitoring interference can be minimized
`sufficiently)
`• monitoring interference: the dependencies between different processes
`in a distributed system are such that any change in the behaviour of one
`process can alter the behaviour of the entire system. The inclusion of
`
`•
`
`12
`
`Monitoring in Distributed Systems
`
`APM.1008.01
`
`HP_1021_0018
`
`

`

`ANSA Phase III
`
`Distribution and monitoring
`
`•
`
`monitoring in a distributed system can alter the behaviour of a program in
`a manner which is important to the observer
`replication: in a distributed system an object may be implemented as a
`replicated group [AR.002 93]. When wishing to monitor such an object it is
`necessary to have the appropriate facilities to deal with both cases:
`— a group with replication transparency
`— a group without replication transparency
`• migration: in a distributed system objects may migrate from one system
`to another. This will cause difficulties with the control and collation of the
`monitored objects. The appropriate monitoring and management facilities
`must deal with such cases
`• passivation: objects may be passivated [AR.006]. It is necessary to
`decide what the meaning of monitoring a passivated object is and how to
`notify the monitoring session of such a case
`objects, encapsulation and security: One of the problems with
`monitoring in an object-oriented system is that the notion of monitoring is
`directly opposed to one of the fundamental characteristics of such
`systems, namely that of encapsulation.Ensuring that the state of objects
`and their associated procedures are protected from external observation
`and interference creates a conflict with the need to monitor those objects.
`For example, the incorporation and usage of monitoring facilities in an
`object may clash with security requirements.
`objects administer their own management: in contrast to centralized
`systems which are characterized by a central management entity,
`management facilities are distributed to the objects. This also applies to
`facilities for management of monitoring
`• monitoring as a distributed activity: the monitoring of a distributed
`system is itself a distributed activity and it therefore requires:
`— tools which allow the management of the process of monitoring access
`and use to remote resources
`— that the monitoring services and their associated management
`structures do not interfere with the performance of the system to an
`unacceptable degree and that they scale well when active in large
`distributed systems
`— dynamic and selective monitoring activation: the ability to define
`the granularity of monitoring, activate it at run-time and modify it as
`the need arises without re-compilation. The granularity of monitoring
`is the level to which a single activity can be monitored in an object
`without having to activate the entire monitoring in the object
`agents and roles: in a distributed system: the assumption that the same
`agent in the same location may carry out several roles does not hold in
`distributed systems. For example, the application programmer,
`application user and the observer roles may be carried out by different
`agents in different locations
`visualization and system models: this concerns the need to present the
`data produced during a monitoring session to the user in an intelligible
`form, relating it to known models of the system. Distribution adds an
`extra level of complexity to intelligible presentation of monitoring
`information. Special analysis and visualization tools are therefore
`essential.
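
The requirement for dynamic and selective monitoring activation listed above can be illustrated with a minimal sketch. It assumes a hypothetical `Account` object whose individual operations can be placed under observation at run-time; the class, its method names and the event tuple layout are illustrative assumptions and are not taken from the ANSA architecture.

```python
from typing import Callable, List, Set, Tuple

# A monitoring event: (object name, operation name, arguments).
Event = Tuple[str, str, tuple]


class Account:
    """Hypothetical application object with per-operation monitoring."""

    def __init__(self, name: str, sink: Callable[[Event], None]) -> None:
        self.name = name
        self.balance = 0
        self._sink = sink                  # where monitoring events are delivered
        self._monitored: Set[str] = set()  # operations currently under observation

    # Management side: change what is monitored while the object is running.
    def activate(self, operation: str) -> None:
        self._monitored.add(operation)

    def deactivate(self, operation: str) -> None:
        self._monitored.discard(operation)

    def _report(self, operation: str, *args: object) -> None:
        # Only operations that were explicitly activated produce events.
        if operation in self._monitored:
            self._sink((self.name, operation, args))

    # Service side: ordinary operations, unaware of who is observing.
    def deposit(self, amount: int) -> None:
        self._report("deposit", amount)
        self.balance += amount

    def withdraw(self, amount: int) -> None:
        self._report("withdraw", amount)
        self.balance -= amount


events: List[Event] = []
account = Account("account-1", events.append)

account.deposit(10)             # not monitored yet: no event
account.activate("withdraw")    # switch on monitoring for one operation only
account.deposit(5)              # still unmonitored
account.withdraw(3)             # produces an event
account.deactivate("withdraw")
account.withdraw(1)             # monitoring switched off again
```

In this sketch the granularity is a single operation: observing `withdraw` does not require activating monitoring for `deposit`, and nothing is recompiled when the selection changes.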
`
`3.4 Conclusions about monitoring in a distributed system
`
`The problems which distribution introduces to monitoring, cited above, can be
`grouped into three major problem areas:
`• the definition, design and incorporation of a monitoring and
`management infrastructure to facilitate the dynamic monitoring of
`distributed systems
`• ordering and reconstruction of the flow of events in a distributed
`system from the monitoring information: the transformation of a
`collection of monitoring information about local events into a global picture.
`The ability to reconstruct can be seen as a prerequisite for providing
`useful presentations of monitoring information
`• the visualization of monitoring information in order to provide the
`observer with useful models of the system and the activities in it.
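
The second problem area, turning local event records into a global picture, can be sketched as follows. The report does not prescribe an ordering technique here; this example assumes Lamport logical clocks, a standard method that is swapped in purely for illustration. The names `Event`, `LamportClock` and `merge_logs` are hypothetical.

```python
import heapq
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True, order=True)
class Event:
    # Field order defines the sort order: timestamp first, node name as tie-break.
    timestamp: int
    node: str
    description: str


class LamportClock:
    """Logical clock kept by each node (an assumed technique, see lead-in)."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        # Local event or message send: advance the clock by one.
        self.time += 1
        return self.time

    def receive(self, sent_timestamp: int) -> int:
        # Message receipt: move past both the local time and the sender's time.
        self.time = max(self.time, sent_timestamp) + 1
        return self.time


def merge_logs(*logs: Iterable[Event]) -> List[Event]:
    # Each local log is already sorted, because a node's clock only increases.
    return list(heapq.merge(*logs))


clock_a, clock_b = LamportClock(), LamportClock()
log_a: List[Event] = []
log_b: List[Event] = []

t_request = clock_a.tick()
log_a.append(Event(t_request, "A", "send request"))
log_b.append(Event(clock_b.receive(t_request), "B", "receive request"))
t_reply = clock_b.tick()
log_b.append(Event(t_reply, "B", "send reply"))
log_a.append(Event(clock_a.receive(t_reply), "A", "receive reply"))

global_view = merge_logs(log_a, log_b)
```

The merged list is a total order in which every send precedes its matching receive. Events that are causally unrelated are ordered arbitrarily by the tie-break, which is one reason reconstruction is listed as a problem area in its own right.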
`

`4 Approach to monitoring in distributed systems
`
`4.1 Introduction
`
`This chapter presents the approach to monitoring and management in object-
`based federated distributed systems, based on the problems cited in Chapter
`3. A more detailed description of the approach and the rationale behind it is
`given in [TR.39 93].
`
`4.2 Monitoring and its management in object-based federated distributed systems
`
`The principles of encapsulation and autonomy of objects mean that each
`object will have its own management service and an interface to it (Figure 4.1).
`
`Figure 4.1: Objects have their own monitoring management service and an interface to it
`[Figure labels: Management interface, Mon Service, Service, Service interface]
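
The arrangement in Figure 4.1 can be sketched as an object that exposes two separate interfaces. This is an illustration under stated assumptions: the classes `MonitoringService`, `ManagementInterface` and `Spooler`, and all of their methods, are hypothetical names chosen for the example and are not defined by the report.

```python
from typing import Callable, List

Observer = Callable[[str], None]


class MonitoringService:
    """Per-object monitoring service (the 'Mon Service' box in Figure 4.1)."""

    def __init__(self) -> None:
        self._observers: List[Observer] = []

    def add(self, observer: Observer) -> None:
        self._observers.append(observer)

    def remove(self, observer: Observer) -> None:
        self._observers.remove(observer)

    def emit(self, event: str) -> None:
        for observer in list(self._observers):
            observer(event)


class ManagementInterface:
    """The route by which an outside observer reaches the monitoring service."""

    def __init__(self, monitoring: MonitoringService) -> None:
        self._monitoring = monitoring

    def attach(self, observer: Observer) -> None:
        self._monitoring.add(observer)

    def detach(self, observer: Observer) -> None:
        self._monitoring.remove(observer)


class Spooler:
    """Hypothetical object offering a service interface and a management interface."""

    def __init__(self, name: str) -> None:
        self._name = name
        self._queue: List[str] = []            # encapsulated state
        self._monitoring = MonitoringService()
        self.management = ManagementInterface(self._monitoring)

    # Service interface: what ordinary clients call.
    def submit(self, document: str) -> int:
        self._queue.append(document)
        self._monitoring.emit(f"{self._name}: submit {document}")
        return len(self._queue)


seen: List[str] = []
spooler = Spooler("spooler-1")
spooler.management.attach(seen.append)   # observer goes through the management interface
spooler.submit("a.txt")
spooler.management.detach(seen.append)
spooler.submit("b.txt")                  # no longer observed
```

The observer never reads the object's queue directly. It learns about activity only through events that the object itself chooses to emit, which keeps monitoring compatible with encapsulation.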
`
`Obtaining a referenc
