(*) Notice:
(21) Appl. No.: 09/089,961
(22) Filed: Jun. 3, 1998

References Cited

IBM Technical Disclosure Bulletin, “Enhanced Method for ...,” 1/97, vol. 40, No. 1 (pp. 111-112).
IBM Technical Disclosure Bulletin, “Mechanism for Automated Problem Determination Service Agents on IBM Local Area Network Server Network,” 10/96, vol. 39, No. 10 (pp. 191-192).
IBM Technical Disclosure Bulletin, “Combining Multiple Layers of Configuration Models into a Single Report,” 3/94, vol. 37, No. 3 (pp. 557-560).

Primary Examiner—Zarni Maung
Assistant Examiner—Jason D. Cardone
(74) Attorney, Agent, or Firm—Duke W. Yee; Jeffrey S.
LaBaw; Stephen R. Tkacs


`US 6,549,932 B1
`1. Technical Field
`The present invention is directed to managing a large
`distributed computer enterprise network and, more
`particularly, to performing discovery operations therein,
`preferably using software components that are deployed in
`the network and adapted to be executed in local runtime
`environments.
`2. Description of the Related Art
`Today, companies desire to place all of their computing
`resources on the company network. To this end, it is known
`to connect computers in a large, geographically-dispersed
`network environment and to manage such an environment in
`a distributed manner. One such management framework
`comprises a server that manages a number of nodes, each of
`which has a local object database that stores object data
`specific to the local node. Each managed node typically
`includes a management framework, comprising a number of
`management routines, that is capable of a relatively large
`number (e.g., hundreds) of simultaneous network connections
`to remote machines. As the number of managed nodes
`increases, the system maintenance problems also increase,
`as do the odds of a machine failure or other fault.
`The problem is exacerbated in a typical enterprise as the
`node number rises. Of these nodes, only a small percentage
`are file servers, name servers, database servers, or anything
`but end-of-wire or “endpoint” machines. The majority of the
`network machines are simple personal computers (“PC’s”)
`or workstations that see little management activity during a
`normal day.
`System administrators typically manage such environ-
`ments through system and network tasks that are configured
`by the administrator on some local machine and then dis-
`tributed or deployed into the network. A machine that is to
`receive the task is referred to as a deployment “target”. The
`locations and characteristics of the target machines,
`however, are typically determined by the administrator
`manually. Thus, for example, if the task to be deployed is a
`database management application,
`the administrator must
`specify the particular database servers in the network. This
`process is cumbersome and time-consuming, especially as
`the size of the network increases to include thousands of
`connected machines. If the system administrator does not
`specify all target machines, the system administration task
`may be implemented incorrectly. Alternatively, if the num-
`ber and location of targets is over-specified, network
`resources are consumed unnecessarily.
`In addition, there are many other reasons why network
`administrators have an interest in performing so-called “dis-
`covery” operations in such a large managed environment. As
`one example, an administrator may desire to determine how
`many and which machines in the environment presently
`support a given version of a software program. Discovery
`may also be required to determine whether a particular
`machine has sufficient resource (e.g., available disk storage)
`to support a software upgrade. Yet another reason to perform
`a discovery operation might simply involve a need or desire
`to perform system or resource inventory to facilitate planning
`for future enterprise expansion. The nature and types of
`discovery operations are thus quite varied.
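As an illustration only, the two kinds of checks mentioned above (which software version a machine supports, and whether it has enough available disk storage) can be sketched as small predicates in Python. The function names, the version-tuple format, and the inventory dictionary are hypothetical and do not come from the disclosure; only `shutil.disk_usage` is a standard library call.

```python
import shutil


def has_sufficient_disk(path: str, required_bytes: int) -> bool:
    """Return True if the filesystem holding `path` has enough free space."""
    return shutil.disk_usage(path).free >= required_bytes


def supports_version(installed: dict, program: str, minimum: tuple) -> bool:
    """Return True if `program` is installed at `minimum` version or later.

    `installed` is a hypothetical inventory: program name -> version tuple.
    """
    version = installed.get(program)
    return version is not None and version >= minimum


# Made-up inventory for one machine.
inventory = {"dbserver": (3, 2, 5)}
print(supports_version(inventory, "dbserver", (3, 2, 0)))
print(has_sufficient_disk(".", 0))
```

A discovery operation for a software upgrade might combine both predicates, reporting a machine as a candidate only when each returns True.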
`Known distributed management architectures do not
`afford the system administrator the ability to issue a distri-
`bution request and deploy a task without having to manually
`associate the tasks with given groups of machines. Likewise,
`such known techniques have not been readily adapted to
`facilitate a wide range of basic discovery operations that are
`desired to facilitate system administration, management and
`maintenance in such an environment, especially as the
`network grows to include thousands of connected, managed
`machines.
`The present invention addresses these and other associ-
`ated problems of the prior art.
`It is thus a primary object of this invention to perform
`discovery operations in a distributed computer enterprise
`environment in which a large number of machines are
`connected and managed.
`It is another primary object of this invention to deploy
`discovery agents in the distributed computer network that
`are executed in local runtime environments to perform such
`discovery operations.
`Another primary objective of this invention is to provide
`software components that are readily deployed into a
`distributed, managed environment for discovering given
`facts (e.g., machine and/or source identity, characteristics,
`state, status, attributes, and the like) that are then useful in
`controlling a subsequent operation (e.g., a task deployment).
`A more specific object of this invention is to provide a
`mechanism by which a dispatcher may identify particular
`machines that are candidates to receive a task deployment so
`that an administrator or other user need not manually
`associate the task with given groups of machines.
`It is a particular object of this invention to deploy a
`Java-based software “discovery agent” into a distributed
`computer network environment
`to discover particular
`machines or resources that are to be targeted to receive a
`particular task deployment within the network.
`A further object of this invention is to launch a set of one or
`more discovery agents into a large, distributed computer
`network in response to a given request for the purpose of
`identifying and locating suitable target machines or
`resources for receipt of a given task. The task may be an
`administrative task, a management task, a configuration
`task, or any other application.
`A further specific object of this invention is to customize
`or tailor the software agent dispatched in the network for
`discovery purposes as a function of the type of task to be
`subsequently deployed. Thus, the software agent may more
`readily determine whether a candidate machine may qualify
`as a potential target for the deployment.
`Yet another more general object of this invention is to
`more fully automate the discovery of distribution targets in
`a large, distributed computing network and thereby reduce
`the expense and complexity of system administration.
`Another object of the present invention is to initially
`dispatch a minimum amount of code that may be necessary
`to discover distribution targets for a subsequent task deployment
`in a large computer network.
`It is a further object of this invention to deploy a self-routing
`software agent into a distributed computer network
`to discover workstations that satisfy given criteria. During
`a particular search, a given agent may “clone” itself at a
`particular node to continue the search along a new network path.


`the software agent runs as a standalone process using
`existing local resources. When the suitability of the workstation
`(as a target machine) is indeterminate, the software
`agent may obtain additional code from the dispatch mecha-
`nism or from some other network source to facilitate its
`determination. Such additional code may be another soft-
`ware agent.
`While one preferred “discovery” operation involves a
`determination of whether a given machine or resource is a
`suitable target for a task deployment, other discovery
`operations may be implemented in like manner. Thus, a
`discovery operation may be implemented for inventory
`control, for determining which machines support which
`versions of given software, for determining the ability of a
`given machine or an associated resource to support given
`software or to perform a given task, and the like.
`The foregoing has outlined some of the more pertinent
`objects of the present invention. These objects should be
`construed to be merely illustrative of some of the more
`prominent features and applications of the invention. Many
`other beneficial results can be attained by applying the
`disclosed invention in a different manner or modifying the
`invention as will be described. Accordingly, other objects
`and a fuller understanding of the invention may be had by
`referring to the following Detailed Description of the pre-
`ferred embodiment.
`Yet another more general object of the present invention
`is to collect
`information about workstations in a large
`computer networked environment as mobile discovery
`agents are dispatched and migrated throughout the network.
`These and other objects of the invention are achieved by
`the disclosed system, method and computer product for
`discovery in a large, distributed computer networking envi-
`ronment. A management
`infrastructure supported in the
`networking environment includes a dispatch mechanism,
`which is preferably located at a central location (e.g., an
`administrative server), and a runtime environment supported
`on given nodes of the network. In particular, the runtime
`environment (e.g., an engine) is preferably part of a distributed
`framework supported on each managed node of the
`distributed enterprise environment.
`One preferred method begins upon a distribution request.
`The distribution request is not limited to any particular type
`of system or network administration, configuration or management
`task. In response to the request, the dispatch
`mechanism determines whether the machines targeted for
`the deployment (namely,
`the “target machines”) can be
`identified from local sources (e.g., a local repository of
`previously-collected or generated configuration
`information). If such information is not available or is
`otherwise not useful, the dispatch mechanism deploys into
`the network a set of one or more “discovery agents” that are
`tasked to locate and identify suitable target(s) for the deployment.
`These one or more agents then “fan-out” into the
`network to collect information to facilitate subsequent
`task deployment. Preferably, the discovery agent is a small
`piece of code that is customized or tailored as a function of
`the particular task to be later deployed. This customization
`reduces the time necessary to complete an overall search
`because the agent
`thus may be “tuned” to evaluate the
`candidate node for a particular characteristic. If that char-
`acteristic is not present, the software agent may then proceed
`elsewhere (or clone itself to follow a new network path).
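The decision made by the dispatch mechanism can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Dispatcher` class, its `repository` and `launched` fields, and the dictionary used to stand in for a tailored agent are all hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Dispatcher:
    # Hypothetical local repository: task type -> previously identified targets.
    repository: dict = field(default_factory=dict)
    # Record of discovery agents sent into the network (stand-in for real agents).
    launched: list = field(default_factory=list)

    def handle_request(self, task_type: str, entry_nodes: list) -> list:
        """Return known targets, or launch discovery agents if none are known."""
        known = self.repository.get(task_type)
        if known:
            # Targets can be identified from local sources; no discovery needed.
            return list(known)
        for node in entry_nodes:
            # Each agent is tailored to the task that will be deployed later.
            self.launched.append({"tuned_for": task_type, "start_node": node})
        return []


dispatcher = Dispatcher(repository={"db-upgrade": ["db-server-1"]})
print(dispatcher.handle_request("db-upgrade", ["gateway-1"]))
print(dispatcher.handle_request("printer-config", ["gateway-1", "gateway-2"]))
print(len(dispatcher.launched))
```

In this sketch an empty return value means that discovery is still in progress; how results flow back to the dispatcher is left out.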
`When a particular discovery agent arrives at a node in the
`network, the software agent preferably is linked into the
`local runtime environment already present to thereby initiate
`a local discovery process. The discovery routine executed by
`the discovery agent may discover that the local machine (or
`some resource or application thereon) is a suitable target,
`that the local machine (or some application thereon) is not
`a suitable target, or that insufficient information is available
`to make this determination. Based on information obtained
`during the discovery process, the software agent also may
`identify one or more new network paths that must be
`traversed to continue the discovery process and thereby
`complete the search. The software agent may then launch
`itself to another node, or it may “clone” itself and launch a
`“cloned” agent over the new network path as needed.
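The three possible outcomes at a node, together with the migrate-or-clone step, can be sketched as a simple traversal. This is an illustrative simulation only: the network is modeled as an in-memory adjacency dictionary, the `Verdict` names and the `evaluate` and `discover` functions are hypothetical, and the shared `visited` set is a simplification that real, independently running agents would not have.

```python
from enum import Enum


class Verdict(Enum):
    SUITABLE = "suitable"
    UNSUITABLE = "unsuitable"
    INDETERMINATE = "indeterminate"


def evaluate(node_facts: dict, required: str) -> Verdict:
    """Classify one node for the characteristic the agent is tuned to."""
    if required not in node_facts:
        return Verdict.INDETERMINATE  # insufficient information available
    return Verdict.SUITABLE if node_facts[required] else Verdict.UNSUITABLE


def discover(network: dict, facts: dict, start: str, required: str):
    """Walk the network from `start`; return (verdict per node, clone count)."""
    results = {}
    agents = [start]   # current position of each agent (original or clone)
    visited = {start}  # simplification: shared knowledge of visited nodes
    clones = 0
    while agents:
        node = agents.pop()
        results[node] = evaluate(facts.get(node, {}), required)
        new_paths = [n for n in network.get(node, []) if n not in visited]
        visited.update(new_paths)
        if new_paths:
            agents.append(new_paths[0])   # the agent launches itself onward
            for extra in new_paths[1:]:
                agents.append(extra)      # a clone follows each extra path
                clones += 1
    return results, clones


network = {"gw": ["a", "b"], "a": ["c"], "b": [], "c": []}
facts = {"gw": {"db": False}, "a": {"db": True}, "b": {}, "c": {"db": True}}
results, clones = discover(network, facts, "gw", "db")
print(results["a"], results["b"], clones)
```

Here node `b` has no information about the `db` characteristic, so it is reported as indeterminate, which is the case in which an agent might fetch additional code.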
`If the software agent discovers that the candidate machine
`is a suitable target, certain identifying information (e.g., a
`confirmation, a machine identifier, a state identifier or the
`like) is generated. The identifying information is then saved
`within a datastore associated with the agent (if the agent is
`to return to the dispatch mechanism) or, alternatively, such
`information is transmitted back to the dispatch mechanism
`(if the agent is to extinguish itself upon completion of the
`discovery process). Such transmission may be effected using
`a simple messaging technique. When a given network path
`is exhausted, the discovery agent then either returns to the
`dispatch mechanism or extinguishes itself, as the case may
`be.
`Thus, at each node, the software agent is preferably run by
`the runtime engine previously deployed there. Alternatively,
`For a more complete understanding of the present invention
`and the advantages thereof, reference should be made to
`the following Detailed Description taken in connection with
`the accompanying drawings in which:
`FIG. 1 illustrates a simplified diagram showing a large
`distributed computing enterprise environment in which the
`present invention is implemented;
`FIG. 2 is a block diagram of a preferred system management
`framework illustrating how the framework functionality
`is distributed across the gateway and its endpoints
`within a managed region;
`FIG. 2A is a block diagram of the elements that comprise
`the LCF client component of the system management framework;
`FIG. 3 illustrates a smaller “workgroup” implementation
`of the enterprise in which the server and gateway functions
`are supported on the same machine;
`FIG. 4 is a distributed computer network environment
`having a management infrastructure for use in carrying out
`the preferred method of the present invention;
`FIG. 5 is a flowchart illustrating a preferred method of
`deploying a software discovery agent
`in response to a
`distribution request in the computer network; and
`FIG. 6 is a flowchart of a software agent local discovery
`mechanism according to the preferred embodiment of this
`invention.
`Referring now to FIG. 1,
`the invention is preferably
`implemented in a large distributed computer environment 10
`comprising up to thousands of “nodes.” The nodes will
`typically be geographically dispersed and the overall environment
`is “managed” in a distributed manner. Preferably,
`the managed environment (ME) is logically broken down
`into a series of loosely-connected managed regions (MR)
`12, each with its own management server 14 for managing
`local resources with the MR. The network typically will
`include other servers (not shown) for carrying out other
`distributed network functions. These include name servers,
`security servers, file servers, threads servers, time servers
`and the like. Multiple servers 14 coordinate activities across
`the enterprise and permit remote site management and
`operation. Each server 14 serves a number of gateway
`machines 16, each of which in turn supports a plurality of
`endpoints 18. The server 14 coordinates all activity within
`the MR using a terminal node manager 20.
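The containment hierarchy just described (a managed region with one server, which serves gateways, each of which supports endpoints) can be pictured with a small data model. The class and field names below are hypothetical and are chosen only to mirror the terms in the text.

```python
from dataclasses import dataclass, field


@dataclass
class Gateway:
    name: str
    endpoints: list = field(default_factory=list)  # endpoint machine names


@dataclass
class ManagedRegion:
    server: str                                    # management server for the MR
    gateways: list = field(default_factory=list)   # Gateway objects it serves

    def all_endpoints(self) -> list:
        """Flatten the endpoints of every gateway in this managed region."""
        return [ep for gw in self.gateways for ep in gw.endpoints]


mr = ManagedRegion(
    server="server-14",
    gateways=[Gateway("gw-1", ["pc-1", "pc-2"]), Gateway("gw-2", ["pc-3"])],
)
print(mr.all_endpoints())
```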
`Referring now to FIG. 2, each gateway machine 16 runs
`a server component 22 of a system management framework.
`The server component 22 is a multi-threaded runtime pro-
`cess that comprises several components: an object request
`broker or “ORB” 21, an authorization service 23, object
`location service 25 and basic object adaptor or “BOA” 27.
`Server component 22 also includes an object library 29.
`Preferably, the ORB 21 runs continuously, separate from the
`operating system, and it communicates with both server and
`client processes through separate stubs and skeletons via an
`interprocess communication (IPC) facility 19. In particular,
`a secure remote procedure call (RPC) is used to invoke
`operations on remote objects. Gateway machine 16 also
`includes an operating system 15 and a threads mechanism
`The system management framework includes a client
`component 24 supported on each of the endpoint machines
`18. The client component 24 is a low cost, low maintenance
`application suite that is preferably “dataless” in the sense
`that system management data is not cached or stored there
`in a persistent manner. Implementation of the management
`framework in this “client-server” manner has significant
`advantages over the prior art, and it facilitates the connectivity
`of personal computers into the managed environment.
`Using an object-oriented approach, the system management
`framework facilitates execution of system management
`tasks required to manage the resources in the MR. Such
`tasks are quite varied and include, without limitation, file
`and data distribution, network usage monitoring, user
`management, printer or other
`resource configuration
`management, and the like.
`In the large enterprise such as illustrated in FIG. 1,
`preferably there is one server per MR with some number of
`gateways. For a workgroup-size installation (e.g., a local
`area network) such as illustrated in FIG. 3, a single server-
`class machine may be used as the server and gateway, and
`the client machines would run a low maintenance framework.
`References herein to a distinct server and one or more
`gateway(s) should thus not be taken by way of limitation as
`these elements may be combined into a single platform. For
`intermediate size installations the MR grows breadth-wise,
`with additional gateways then being used to balance the load
`of the endpoints.
`The server is the top-level authority over all gateway and
`endpoints. The server maintains an endpoint list, which
`keeps track of every endpoint in a managed region. This list
`preferably contains all information necessary to uniquely
`identify and manage endpoints including, without limitation,
`such information as name, location, and machine type. The
`server also maintains the mapping between endpoint and
`gateway, and this mapping is preferably dynamic.
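As a rough sketch of the bookkeeping described above, the server's endpoint list and its endpoint-to-gateway mapping could be modeled as two dictionaries, with the mapping updated whenever an endpoint moves. The `EndpointRecord` and `ServerRegistry` names and methods are hypothetical; the text specifies only the kind of information kept (name, location, machine type) and that the mapping is preferably dynamic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EndpointRecord:
    name: str
    location: str
    machine_type: str


class ServerRegistry:
    """Hypothetical model of the server's endpoint list and gateway mapping."""

    def __init__(self):
        self.endpoints = {}   # endpoint name -> EndpointRecord
        self.gateway_of = {}  # endpoint name -> gateway name (dynamic)

    def register(self, record: EndpointRecord, gateway: str) -> None:
        self.endpoints[record.name] = record
        self.gateway_of[record.name] = gateway

    def reassign(self, name: str, new_gateway: str) -> None:
        """Move a known endpoint to another gateway."""
        if name not in self.endpoints:
            raise KeyError(name)
        self.gateway_of[name] = new_gateway

    def endpoints_on(self, gateway: str) -> list:
        return sorted(n for n, g in self.gateway_of.items() if g == gateway)


registry = ServerRegistry()
registry.register(EndpointRecord("pc-1", "Austin", "x86 PC"), "gw-1")
registry.register(EndpointRecord("pc-2", "Austin", "workstation"), "gw-1")
registry.reassign("pc-2", "gw-2")
print(registry.endpoints_on("gw-1"), registry.endpoints_on("gw-2"))
```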
`As noted above,
`there are one or more gateways per
`managed region. Preferably, a gateway is a fully-managed
`node that has been configured to operate as a gateway. As
`endpoints log in, the gateway builds an endpoint list for its
`endpoints. The gateway’s duties preferably include: listening
`for endpoint login requests, listening for endpoint update
`requests, and (its main task) acting as a gateway for method
`invocations on endpoints.
`As also discussed above, the endpoint is a machine
`running the system management framework client
`component, which is referred to herein as the low cost
`framework (LCF). The LCF has two main parts as illustrated
`in FIG. 2A: the LCF daemon 24a and an application runtime
`library 24b. The LCF daemon 24a is responsible for endpoint
`login and for spawning application endpoint
`executables. Once an executable is spawned, the LCF dae-
`mon 24a has no further interaction with it. Each executable
`is linked with the application runtime library 24b, which
`handles all further communication with the gateway.
`The server and each of the gateways is a
`computer or “machine.” For example, each computer may
`be a RISC System/6000® (a reduced instruction set or
`so-called RISC-based workstation) running the
`AIX® (Advanced Interactive Executive) operating system,
`preferably Version 3.2.5 or greater. Suitable alternative
`machines include: an IBM-compatible PC x86 or higher
`running Novell UnixWare 2.0, an AT&T 30

