throbber
US007062683B2
`
`(12) United States Patent
`Warpenburg et a].
`
`(10) Patent N0.:
`(45) Date of Patent:
`
`US 7,062,683 B2
`Jun. 13, 2006
`
`(54) TWO-PHASE ROOT CAUSE ANALYSIS
`
`(75) Inventors: Michael R. Warpenburg, Austin, TX
`(US); Michael J. Scholtes, Austin, TX
`(Us)
`
`(73) Assignee: BMC Software, Inc., Houston, TX
`(Us)
`
`( * ) Notice:
`
`Subject to any disclaimer, the term of this
`patent is extended or adjusted under 35
`U.S.C. 154(b) by 541 days.
`
`New Product Review: PATROL® for Diagnostic Manage
`ment.” www.bmc.com/technews/002/root.html. Dec. 2002.
`12 pages.
`
`(Continued)
`Primary ExamineriScott Baderman
`Assistant Examineriloshua Lohn
`(74) Attorney, Agent, or F irmiWong, Cabello, Lutsch,
`Rutherford & Brucculeri, LLP
`
`(57)
`
`ABSTRACT
`
`(21) Appl. No.: 10/420,174
`
`(22) Filed:
`
`Apr. 22, 2003
`
`(65)
`
`Prior Publication Data
`
`US 2004/0225927 A1
`
`Nov. 11, 2004
`
`(51) Int. Cl.
`(2006.01)
`G06F 11/00
`(52) US. Cl. ........................... .. 714/43; 714/27; 714/56
`(58) Field of Classi?cation Search ................ .. 714/26,
`714/27, 56
`See application ?le for complete search history.
`
`(56)
`
`References Cited
`
`U.S. PATENT DOCUMENTS
`
`4/1996 Dev et a1. ................. .. 709/223
`5,504,921 A *
`6,006,016 A * 12/1999 Faigon et a1. ............... .. 714/48
`6,072,777 A
`6/2000 Bencheck et a1.
`6,249,755 B1
`6/2001 Yemini et a1.
`2002/0086671 Al* 7/2002 Amin et al. .............. .. 455/432
`2003/0051195 Al* 3/2003 Bosa et al. ................. .. 714/43
`
`OTHER PUBLICATIONS
`
`BMC Software, Inc. Product Review. Roy, Gerry and Tracy
`Luciani. “A Modeling Approach to Root-cause Analysis:
`
`A two-phase method to perform root-cause analysis over an
`enterprise-speci?c fault model is described. In the ?rst
`phase, an up-stream analysis is performed (beginning at a
`node generating an alarm event) to identify one or more
`nodes that may be in failure. In the second phase, a down
`stream analysis is performed to identify those nodes in the
`enterprise whose operational condition are impacted by the
`prior determined failed nodes. Nodes identi?ed as failed as
`a result of the up-stream analysis may be reported to a user
`as failed. Nodes identi?es as impacted as a result of the
`down-stream analysis may be reported to a user as impacted
`and, bene?cially, any failure alarms associated with those
`impacted nodes may be masked. Up-stream (phase 1) analy
`sis is driven by inference policies associated with various
`nodes in the enterprise’s fault model. An inference policy is
`a rule, or set of rules, for inferring the status or condition of
`a fault model node based on the status or condition of the
`node’s immediately down-stream neighboring nodes. Simi
`larly, down-stream (phase 2) analysis is driven by impact
`policies associated with various nodes in the enterprise’s
`fault model. An impact policy is a rule, or set of rules, for
`assessing the impact on a fault model node based on the
`status or condition of the node’s immediately up-stream
`neighboring nodes.
`
`90 Claims, 8 Drawing Sheets
`
`100
`
`9
`
`105
`
`110
`
`RECEIVE ALARM
`
`l
`
`PERFORM UP
`STREAM ANALYSIS
`
`‘I’
`115
`\_ PERFORM DOWN
`STREAM ANALYSIS
`
`120
`
`i
`
`REPORT RESULTS
`
`ServiceNow, Inc,.'s Exhibit No. 1001
`
`001
`
`

`
`US 7,062,683 B2
`Page 2
`
`OTHER PUBLICATIONS
`Gensym Corporation. “Integrity SymCur Developer’s Guide
`Version 3.1.” Sep. 2000. 20 pages.
`
`Micromuse Inc. White Paper. “The Netcool®/OMNIbusTM
`System Architecture.” Mar. 2002. 14 pages.
`
`Micromuse Inc. White Paper. “Precise Root Cause Analysis:
`Taking the GuessWork Out of Solving Network Faults and
`Failures.” Jul. 2002. 14 pages.
`Smarts Corporation White Paper. “Downstream Suppression
`'NtR tC
`Anl '.”Ot.2002.l3
`.
`1S 0 O0 ause
`a ysls
`C
`pages
`* cited by examiner
`
`ServiceNow, Inc,.'s Exhibit No. 1001
`
`002
`
`

`
`U.S. Patent
`
`Jun. 13, 2006
`
`Sheet 1 6f 8
`
`US 7,062,683 B2
`
`100
`
`?
`
`105
`\- RECEIVE ALARM
`l
`
`\_ PERFORM UP
`STREAM ANALYSIS
`
`i
`115
`\_ PERFORM DOWN
`STREAM ANALYSIS
`
`l
`\- REPORT RESULTS
`
`FIG. 1
`
`ServiceNow, Inc,.'s Exhibit No. 1001
`
`003
`
`

`
`ServiceNow, Inc,.'s Exhibit No. 1001
`
`004
`
`

`
`ServiceNow, Inc,.'s Exhibit No. 1001
`
`005
`
`

`
`U.S. Patent
`
`Jun. 13, 2006
`
`Sheet 4 6f 8
`
`US 7,062,683 B2
`
`a 2.5
`
`
`
`2300 “.6,
`
`$25,228 A
`$202102 “H
`
`ESUJEE “H
`
`ml?
`
`WMWOQEOU
`
`ServiceNow, Inc,.'s Exhibit No. 1001
`
`006
`
`

`
`U.S. Patent
`
`Jun. 13, 2006
`
`Sheet 5 6f 8
`
`US 7,062,683 B2
`
`qmln
`
`5025 8
`
`NM.
`
`Q5025
`
`g
`
`2.652%; 8
`
`G26 2::
`olmm
`B<
`
`mmmOmsAOu
`
`ServiceNow, Inc,.'s Exhibit No. 1001
`
`007
`
`

`
`ServiceNow, Inc,.'s Exhibit No. 1001
`
`008
`
`

`
`U.S. Patent
`
`Jun. 13, 2006
`
`Sheet 7 6f 8
`
`US 7,062,683 B2
`
`FA ULT_MODEL ATM_EXAMPLE [
`
`CONDITION BINDER.NO_MONEY {
`INFERENCE_POLIC\/ ANY {}
`
`}
`
`CONDITION ATM_SRVC.DEGRADED {
`
`IMPACT_POLICY PERCENTAGE {
`THRESHOLD = 30;
`NODES ATM.DOWN;
`
`IMPACT_POLICV PERCENTAGE {
`THRESHOLD = 50,‘
`NODES ATM.NO_MONEY;
`
`IMPACT_POLICV PERCENTAGE {
`THRESHOLD = 50;
`NODES ATM.PAPER_J'AM;
`
`CONDITION ATM_SRVC.DOWN {
`IMPACT POLICY ALL {}
`
`CONDITION ATM.NO_MONEV {
`IMPACT_POLICY COUNT( THRESHOLD = 2;)
`
`}
`
`INFERENCE__POLIC\/ PERCENTAGE( THRESHOLD = 50;)
`
`FIG. 6B
`
`ServiceNow, Inc,.'s Exhibit No. 1001
`
`009
`
`

`
`U.S. Patent
`
`Jun. 13, 2006
`
`Sheet 8 6f 8
`
`US 7,062,683 B2
`
`00%
`
`A
`
`.mloM
`
`
`
`
`
`$8628 88%8 $8038
`
`$8628
`
`5‘ N<
`
`$8128
`
`
`230m 2300
`
`02 38658 OR
`' \%
`
`at QR
`
`60280 z>>oo
`
`
`6< $8628 rk
`
`I J)
`NE <<<<Ou L
`
`QM
`
`~>
`
`Z300
`
`ServiceNow, Inc,.'s Exhibit No. 1001
`
`010
`
`

`
`US 7,062,683 B2
`
`1
`TWO-PHASE ROOT CAUSE ANALYSIS
`
`BACKGROUND
`
`The invention relates generally to the ?eld of event
`detection and fault diagnosis for computer systems and,
`more particularly but not by Way of limitation, to techniques
`(devices and methods) for de?ning and using fault models
`for the monitoring, diagnosis and recovery of error condi
`tions in a enterprise computing system.
`Contemporary corporate computer netWorks comprise a
`plurality of different computer platforms and softWare appli
`cations interconnected through a number of different paths
`and various hardWare devices such as routers, gateWays and
`sWitches. Workstations, dedicated ?le, application and mail
`servers and mainframe computer systems. Illustrative soft
`Ware applications include accounting, payroll, order entry,
`inventory, shipping and database applications. The collec
`tion of such entitiesihardWare and softWareiis often
`referred to as an “enterprise.”
`As enterprises have become larger and more complex,
`their reliability has become ever more dependent upon the
`successful detection and management of problems that arise
`during their operation. Problems can include hardWare and
`softWare failures, hardWare and softWare con?guration mis
`matches and performance degradation due to limited
`resources, external attacks and/or loss of redundancy. Opera
`tional problems generate observable events, and these events
`can be monitored, detected, reported, analyZed and acted
`upon by humans or by programs. It has been observed that
`as an enterprise groWs (i.e., incorporates more monitored
`componentsihardware and software), the rate at Which
`observable events occur increases dramatically. (Some stud
`ies indicate event generation rates increase exponentially
`With enterprise siZe.) Quickly and decisively identifying the
`cause of any given problem can be further complicated
`because of the large number of sympathetic events that may
`be generated as a result of an underlying problem(s). In the
`?eld of enterprise monitoring and management, the large
`number of sympathetic events that are generated as a result
`of one, or a feW, underlying root cause failures, is often
`referred to as an “alert storm.” For example, a router failure
`may generate a “router doWn” event and a large number of
`“lost connectivity” events for components that communicate
`through the failed router. In this scenario, the router failure
`is the fundamental or “root cause” of the problem and the
`lost connectivity events are “sympathetic” events. Studies
`have estimated that up to 80% of a netWork’s doWn-time is
`spent analyZing event data to identify the underlying prob
`lem(s). This doWn-time represents enormous operational
`losses for organizations that rely on their enterprises to
`deliver products and services.
`One prior art approach to enterprise diagnosis relies on
`user speci?ed rules of the form: 1r (CONDITION-A) AND/OR/NOT
`(CONDITION-B) .
`.
`. (CONDITION-N) THEN (CONDITION-Z). Known as
`a “rules-based” approach, these techniques monitor the
`enterprise to determine the sate of all tested conditions (e.g.,
`conditions A, B and N). When all of a rule’s conditions are
`true, that rule’s conclusion state is asserted as true (e.g.,
`condition-Z). While this approach has the advantage of being
`easy to understand, it is virtually impossible to implement in
`any comprehensive manner for large enterprises. (The num
`ber of possible error states (i.e., combinations of conditions
`A, B .
`.
`. N) giving rise to a fault (e.g., conclusions), groWs
`exponentially With the number monitored components.) In
`addition, rules-based approaches are typically tightly
`coupled to the underlying enterprise architecture such that
`
`20
`
`25
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`2
`any changes in the architecture (e.g., the addition of moni
`tored components) requires changes (e.g., the addition of
`rules) to the underlying rule-set. Further, in a dynamic
`environment Where monitored components are added and/or
`removed on a Weekly or daily basis, rules-based approaches
`becomes nearly impossible to implement in a controlled and
`reliable manner because of the overhead associated With
`creating and/or modifying rules for each component added
`or the removal of one or more rules for each component
`removed.
`Another prior art approach to enterprise diagnosis is
`knoWn as “pattern matching.” This approach also uses rules
`but, unlike the rules-based analysis introduced above, alloWs
`tWo-Way reasoning through a rule-set. For example, if a rule
`of the form 1r (coNDITIoN-A)AND (CONDITION-B) THEN (CONDITION
`c) exists and both condition-A and condition-C are knoWn to
`be true (e.g., through monitoring or measurement), these
`systems can infer that condition-B must also be true. This,
`in turn, may alloW the satisfaction of additional rules. While
`more poWerful and ?exible than standard rules-based sys
`tems, pattern matching systems, like rules-based systems,
`must have all (or a su?icient number) of their error states
`de?ned a priori and are dif?cult to maintain in a dynamic
`environment.
`Yet another prior art approach to enterprise diagnosis uses
`signed directed graphs (digraphs) to propagate enterprise
`variable status betWeen different modeled nodes, Where each
`node in such a model (a fault model) represents a condition
`of a modeled component in the enterprise. For example, a
`node in an enterprise fault model may represent that port-1
`in router-A has failed or that Database Management Appli
`cation-Z has less than a speci?ed amount of Working buffer
`memory available. Many digraph implementations use
`generic, class-level models of the monitored components in
`a classic object-oriented approach and, for this reason, are
`often referred to as Model Based Reasoning (MBR) tech
`niques. Unlike rule-based systems, MBR systems provide a
`scalable method to identify and propagate detected error
`states to one or more “event correlation” engines. Event
`correlation engines, in turn, provide the computational sup
`port to analyZe the myriad of event indications and to
`determine, based on preprogrammed logic, What a likely
`root-cause of the monitored events is. Heretofore, the ability
`to correlate error states in a highly dynamic environment and
`to account for the possibility of more than one simultaneous
`error or fault has stunted the ability of MBR systems to
`perform up to their theoretical limits. Accordingly, it Would
`be bene?cial to provide improved event correlation and
`analysis techniques for MBR systems.
`
`SUMMARY
`
`In one embodiment the invention provides an enterprise
`fault analysis method, Where an enterprise includes a plu
`rality of monitored components some of Which may be
`hardWare and some of Which may be softWare. In various
`embodiments, the enterprise (or that portion of the enterprise
`being monitored) is represented by a fault model having a
`plurality of communicatively coupled nodes. The method
`includes receiving an event noti?cation from a ?rst node,
`performing an up-stream analysis of the fault model begin
`ning at the ?rst node, identifying a second node (the second
`node having a status value modi?ed during the up-stream
`analysis to indicate a failed status), performing a doWn
`stream analysis of the fault model beginning at the second
`node, identifying those nodes in a contiguous path betWeen
`the second node and the ?rst node Whose impact values
`
`ServiceNow, Inc,.'s Exhibit No. 1001
`
`011
`
`

`
`US 7,062,683 B2
`
`3
`indicate an impacted performance condition in accordance
`with the down-stream analysis, reporting the second node as
`a root cause of the received event noti?cation, and reporting
`at least one of the identi?ed nodes as impacted by the root
`cause of the received event noti?cation and not as root
`causes of the received event noti?cation. The method may
`be stored in any media that is readable and executable by a
`computer system.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`FIG. 1 shows, in ?owchart form, an enterprise monitoring
`and analysis method in accordance with one embodiment of
`the invention.
`FIG. 2 shows, in ?owchart form, an up-stream analysis
`technique in accordance with one embodiment of the inven
`tion.
`FIG. 3 shows, in ?owchart form, a down-stream analysis
`technique in accordance with one embodiment of the inven
`tion.
`FIGS. 4A and 4B present a Management Schema in
`accordance with one embodiment of the invention for an
`illustrate ATM enterprise.
`FIGS. 5A and 5B present a Managed Object Inventory for
`the Management Schema of FIGS. 4A and 4B.
`FIGS. 6A and 6B present a Fault Model for the Manage
`ment Schema and Managed Object Inventory of FIGS. 4A,
`4B, 5A and 5B.
`FIG. 7 shows an Impact Graph in accordance with the
`illustrative enterprise of FIGS. 4*6.
`
`DETAILED DESCRIPTION
`
`The invention relates generally to the ?eld of event
`detection and fault diagnosis for computer systems and,
`more particularly but not by way of limitation, to methods
`and devices for de?ning and using fault models for the
`monitoring, diagnosis and recovery of error conditions in an
`enterprise computing system.
`The following embodiments of the invention, described in
`terms of model based reasoning approaches using object
`oriented characterizations of monitored components, are
`illustrative only and are not to be considered limiting in any
`respect. Speci?cally, the following embodiments of the
`invention utiliZe an object-oriented modeling approach for
`characterizing: (l) monitored components; (2) their physical
`and/or logical connectivity and (3) the propagation of
`detected anomalies or faults. In this approach, each compo
`nent (hardware, software and logical) that is to be monitored
`is de?ned by a software object that characteriZes that object
`in terms of its function and possible relationships with other
`modeled objects. The collection of all such object de?nitions
`is referred to as the Management Schema. The topology or
`architecture of a speci?c enterprise comprising objects from
`the Management Schema is referred to as the Managed
`Object Inventory. The structure, organiZation or connectivity
`of the Managed Object Inventory may be speci?ed by a user,
`determined automatically via auto-discovery techniques or
`by a combination of these approaches. Based on the Man
`agement Schema, a Fault Model (a directed graph or
`digraph) may be determined, wherein each node in the Fault
`Model represents a “condition” of a modeled component.
`Thus, if a single managed object (i.e., an object in the
`Management Schema) is characteriZed by a plurality of
`conditions, it may be represented by a plurality of nodes in
`a Fault Model. The topology or architecture of a speci?c
`Fault Model is referred to as an Impact Graph.
`
`20
`
`25
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`4
`As described above, Management Schema and Fault
`Models are class-level entities. They de?ne abstract objects
`(e.g., network routers and applications) in terms of what
`“conditions” those objects may be associated with and the
`allowable “relationships” between different objects. In con
`trast, Managed Object Inventories and Impact Graphs rep
`resent speci?c instances of a Management Schema and Fault
`Model respectively.
`In accordance with object-oriented approaches to the
`invention, one or more nodes in an Impact Graph have an
`associated inference policy and/or impact policy. As used
`herein, an inference policy is a rule, or set of rules, for
`inferring the status or condition of a fault model node based
`on the status or condition of the node’s immediately down
`stream neighboring nodes, wherein a down-stream node is a
`node that is coupled to another (digraph) node in the
`direction of information ?ow. Similarly, an impact policy is
`a rule, or set of rules, for assessing the impact on a fault
`model node based on the status or condition of the node’s
`immediately up-stream neighboring nodes, wherein an up
`stream node is a node that is coupled to another (digraph)
`node in the direction against information ?ow. Typically,
`nodes in a fault model in accordance with the invention
`utiliZe a status value (e.g., a Boolean value to indicate
`whether a node is failed or not-failed, or a real number such
`as 0.0 to 1.0) to record a node’s status or condition and an
`impact value (e.g., a Boolean value to indicate whether a
`node is impacted or not-impacted by its up-stream neigh
`bors, or a real number such as 0.0 to 1.0) to record the node’s
`impact value.
`Referring to FIG. 1, a model based reasoning (MBR)
`approach 100 to enterprise monitoring and fault analysis in
`accordance with the invention uses a combination of up
`stream analysis (based on the evaluation of inference poli
`cies) and down-stream analysis (based on the evaluation of
`impact policies) on an Impact Graph to e?iciently and
`effectively identify and isolate root cause faults from the
`myriad of event noti?cations or alarms, many or most of
`which may be “sympathetic,” that one or more underlying
`fault conditions may trigger. On event noti?cation (block
`105), an up-stream analysis of the Impact Graph beginning
`with the node receiving the event noti?cation is performed
`(block 110). Up-stream analysis in accordance with block
`110 may modify the status value of Zero or more nodes in the
`enterprise’s Impact Graph up-stream from the node receiv
`ing the event noti?cation. Next, the furthest up-stream node
`(relative to the node receiving the initial event noti?cation)
`whose status value was modi?ed in accordance with block
`110 is selected as a starting point from which a down-stream
`analysis is performed (block 115). Down-stream analysis in
`accordance with block 115 may modify the impact value of
`Zero or more nodes in the enterprise’s Impact Graph down
`stream from the down-stream analysis’ starting node. (One
`of ordinary skill in the art will recogniZe that if there is more
`than one node equally distant from the node receiving the
`event noti?cation and which has had its status value modi
`?ed in accordance with block 110, an arbitrary one of these
`nodes may be selected to begin the down-stream analysis in
`accordance with block 115.)
`With up-stream and down-stream analysis completed
`enterprise status, including identi?cation of one or more
`root-cause failures and identi?cation of sympathetic event
`noti?cations, may be reported (block 120). In general, those
`furthest up-stream nodes in the Impact Graph having a status
`value indicative of failure are identi?ed as “root causes.”
`Identi?ed root causes may be displayed in any convenient
`manner to a user. For example, identi?ed root-cause failures
`
`ServiceNow, Inc,.'s Exhibit No. 1001
`
`012
`
`

`
`US 7,062,683 B2
`
`5
`may be displayed graphically on a enterprise monitor con
`sole in a manner that highlights their status (e.g., shown in
`red). Nodes doWn-stream from the identi?ed root-causes and
`Whose impact values indicate they are impacted by the
`root-cause failures may also be shoWn to a user. It has been
`found bene?cial, to mask event noti?cation (e.g., alarms)
`associated With impacted nodes and/or to display them in a
`manner distinct from the root-causes. For example, impacted
`nodes may be shoWn in yelloW and their event or alarm
`noti?cations moved from an “Alarm List” to an “Impacted
`List,” both of Which may be displayed to a user.
`In one embodiment, status values are restricted to the
`Boolean values of FAILURE and NO-FAILURE and impact
`values are restricted to the Boolean values of IMPACTED
`and NOT-IMPACTED. That is, if a node’s status value is
`true, then it is in failure and if a node’s impact value is true,
`then it is impacted by one or more up-stream nodes. In
`another embodiment, either or both of a node’s status and
`impact values may be real numbers. When real numbers are
`used to represent status and/or impact values (i.e., the result
`of evaluating inference and impact policies), one of ordinary
`skill in the art Will recogniZe that a threshold function may
`be applied to interpret the determined status into a decision:
`failure/no-failure or impacted/not-impacted. For example,
`fuZZy logic techniques may be used to reduce real values to
`decision values. In yet another embodiment of the invention,
`either or both of a node’s status and impact values may have
`associated attributes. For example, a node’s status value may
`be “measured” or “inferred.” A measured status value is a
`value obtained through a direct measurement of the moni
`tored object. An inferred status value is a value determined
`through evaluation of a node’s inference policy. With respect
`to measured and inferred values, it has been found bene?cial
`to give a priority to measured values. For example, if a node
`has a measured status value of NO-FAILURE, and execution
`of the node’s inference policy during up-stream analysis
`Would infer a status value of FAILURE, the measured
`NO-FAILURE value is given precedence. That is, the node’ s
`status value is not changed.
`In addition, either or both of a node’s status and impact
`values may have an associated temporal attribute. For
`example, a node’s status value may be alloWed to “decay”
`over a ?rst time period from a measured FAILURE value to
`an inferred FAILURE value, and from an inferred FAILURE
`value to an inferred NO-FAILURE value over a second time
`period. In addition, a measured NO-FAILURE value may be
`alloWed to decay to an inferred NO-FAILURE value over a
`third time period. Similarly, a node’s impact value may be
`alloWed to “decay” from an IMPACTED value to a NOT
`IMPACTED value over a fourth time period. The rate of
`“decay” over Which status and impact values may be
`alloWed to change Will depend upon the enterprise being
`monitored and is a matter of design choice. In one embodi
`ment, a node’s “decay time” may relate to the underlying
`component’s mean-time to repair. For example, if a moni
`tored router or database application has a mean-time to
`repair of thirty (30) minutes, then its status value may decay
`from “measured failure” to “inferred failure” after thirty (30)
`minutes. Other user-speci?ed time-frames may also be used.
`It has been found bene?cial to select a time period Which
`indicates, to the party implementing the method, that the
`measured and/or inferred value retains its signi?cance. In
`addition, a node’s inferred status may be alloWed to decay
`in a similar fashion or, alternatively, be alloWed to change
`from one inferred value to another based on evaluation in
`accordance With the invention.
`
`20
`
`25
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`6
`Generally speaking, up-stream analysis of an Impact
`Graph begins at the node receiving an event noti?cation and
`proceeds, in an iterative fashion, up the graph until (1) there
`are no more up-stream nodes; or (2) a node’s status value
`does not change as a result of the node’s inference policy; or
`(3) the inferred status value for a node is different from the
`node’s measured status value.
`An up-stream analysis ?owchart in accordance With one
`embodiment of the invention is shoWn in FIG. 2. To begin,
`the node receiving the event noti?cation is selected (block
`200). Next, it is determined if the selected node has any
`immediately doWn-stream nodes (block 205). If it does not
`(the “No” prong of block 205), up-stream processing con
`tinues at block 220. If the selected node has at least one
`immediately doWn-stream node (the “Yes” prong of block
`205), the node’s inference policy is evaluated and its status
`value modi?ed in accordance With the results thereof (block
`210). If the node’s status value is not changed or the neWly
`inferred status value con?icts With the node’s “measured”
`status value (the “No” prong of block 215), processing
`continues at block 230. If the node’s status value is changed
`as a result of evaluating its inference policy and is not in
`con?ict With the node’s measured status value (the “Yes”
`prong of block 215), a test to determine if the selected node
`has any unevaluated (that is, a node Whose inference policy
`has not yet been evaluated during the up-stream analysis)
`immediately up-stream nodes (block 220). If no unevaluated
`up-stream nodes exist (the “No” prong of block 220),
`processing continues at block 230. If at least one unevalu
`ated immediately up-stream node exists (the “Yes” prong of
`block 220), one of the unevaluated nodes is selected (block
`225), and processing continues at block 210. If the status
`value of the currently selected node does not change or, if
`changed, is in con?ict With the node’s measured status value
`as a result of the acts of block 210 (the “No” prong of block
`215) or the currently selected node has no unevaluated
`immediately up-stream nodes (the “No” prong of block
`220), that node through Which the currently selected node
`Was selected is made the “selected” node (block 230). If the
`neWly selected node has at least one unevaluated immedi
`ately up-stream node (the “Yes” prong of block 235),
`processing continues at block 225. If the neWly selected
`node does not have any unevaluated immediately up-stream
`node (the “No” prong of block 235), a further check is made
`to determine if the neWly selected node is the starting node
`(block 240). If the neWly selected node is not the starting
`node (the “No” prong of block 240), processing continues at
`block 230. If the neWly selected not is the starting node (the
`“Yes” prong of block 240), up-stream processing is com
`plete (block 245). The acts ofblocks 210, 215, 220, 230, 235
`and 225 evaluate those nodes in the up-stream path from the
`selected starting node (block 200) in an iterative fashion.
`Generally speaking, doWn-stream analysis of an Impact
`Graph begins at the most up-stream node Whose status value
`Was changed during the up-stream analysis phase (block
`110). The starting node’s impact policy, and each successive
`immediately doWn-stream node’s impact policy are then
`evaluated until (1) there are no more doWn-stream nodes or
`(2) a doWn-stream node’s impact value does not change as
`a result of the evaluation.
`A doWn-stream analysis ?oWchart in accordance With one
`embodiment of the invention is shoWn in FIG. 3. On
`completion of the up-stream analysis (block 110), that node
`that is furthest up-stream from the node receiving the event
`noti?cation is selected as a starting node (block 300). If
`more than one node is equally distant (i.e., up-stream) from
`the node receiving the event noti?cation, an arbitrary one of
`
`ServiceNow, Inc,.'s Exhibit No. 1001
`
`013
`
`

`
`US 7,062,683 B2
`
`20
`
`25
`
`30
`
`7
`these nodes may be selected to begin the doWn-stream
`analysis. If the selected starting node has at least one
`immediately up-stream node (the “Yes” prong of block 305),
`doWn-stream processing continues at block 330. If the
`selected starting node does not have any immediately up
`stream nodes (the “No” prong of block 305), a check is made
`to determine if the selected node has any unevaluated (that
`is, a node Whose impact policy has not yet been evaluated
`during the doWn-stream analysis) immediately doWn-stream
`nodes (block 310). If the selected node has no unevaluated
`immediately doWn-stream nodes (the “No” prong of block
`310), a check is made to determine if the selected node and
`the starting node are the same (block 315). If the selected
`node is not the starting node (the “No” prong of block 315),
`processing continues at block 340. If the selected node is the
`starting node (the “Yes” prong of block 315), the doWn
`stream analysis is complete (block 320). If, on the other
`hand, the selected node has at least one unevaluated imme
`diately doWn-stream node (the “Yes” prong of block 310),
`one of these nodes is made the “selected” node (block 325)
`and its impact policy is evaluated (block 330). If the selected
`node’s impact value is not changed as a result of evaluating
`its impact policy (the “No” prong of block 335), doWn
`stream processing of the current “branch” of the Impact
`Graph is complete and that node through Which the currently
`selected node Was selected is made the “selected” node
`(block 340) and processing continues at block 310. The acts
`of blocks 310, 325, 330, 335 and 340 evaluate those nodes
`in the doWn-stream path from the selected starting node
`(block 300) in an iterative fashion.
`By Way of example, consider an enterprise consisting of
`a plurality of Automatic Teller Machines (ATMs) that are
`coupled to a central banking facility via a satellite commu
`nications systems. Referring to FIG. 4, Management
`Schema 400 for a simple ATM enterprise is shoWn com
`prising VSAT object 405, ATM object 410, Binder object
`415 and ATM Service object 420 (ATM SRVC). In this
`example, VSAT object 405 represents a satellite communi
`cations network, ATM object 410 represents a physical ATM
`machine, Binder object 415 represents a money dispensing
`40
`mechanism (each ATM machine typically includes as many
`binders as types of bills that it dispenses) and ATM Service
`object 420 represents a logical collection of services to
`Which one or more ATM devices may belong. As shoWn,
`VSAT 405 is coupled to ATM 410 by a VSAT_SRVS (V SAT
`serves) relationship, and ATM 410 is coupled to Binder 415
`by a HAS_MONEY_IN relationship and to ATM Service
`420 by a COMPOSES relationship. In addition, VSAT 405,
`ATM 410 and ATM Service 420 have associated status
`variables identi?ed as DOWN. ATM 410 further has impact
`variables identi?ed as COMM_PROB (communication
`problem), NO_MONEY and PAPER_JAM and VSAT Ser
`vice 420 has an impact variable identi?ed as DEGRADED.
`For convenience and ease of discussion, it is assumed that all
`status and impact variables are Boolean. FIG. 4B provides
`pseudo-code de?nitions for each object identi?ed in Man
`agement Schema 400.
`Referring to FIG. 5, Managed Object Inventory 500 in
`accordance With Management Schema 500 comprises Bind
`ers B1 505 and B2 510 coupled to ATM A1 515 via
`HAS_MONEY_IN relations. Similarly, Binder B1 520 is
`coupled to ATM A2 525 through a HAS_MONEY_IN
`relation. Both ATM A1 515 and ATM A2 525 are coupled to
`ATM Service AS1 530 and VSAT V1 535 through COM
`POSES and VSAT_SRVS relations respectively. FIG. 5B
`provides a pseudo-code representation of Managed Object
`Inventory 500.
`
`50
`
`8
`FIG. 6 illustrates Fault Model 600 in accordance With
`Management Schema 400 and Managed Object Inventory
`500. As shoWn, Fault Model 600 comprises: VSAT DOWN
`condition node 605 coupled to ATM COMM_PROB con
`dition node 610 by a VSAT serves relation; ATM COM
`M_PROB condition node 610 is coupled to ATM DOWN
`condition node 615 through a Self relation (Where “Self”
`relations are an inherent relation provided in all object
`oriented environments); ATM DOWN condition node 615 is
`coupled to ATM Service DOWN condition node 620 and
`ATM Service DEGRADED condition node 625 through
`COMPOSES relations; Binder NO_MONEY condition
`node 630 is coupled to ATM NO_MONEY condition node
`635 through a HAS_MONEY_IN relation; and ATM
`NO_MONEY condition node 635 and ATM PAPER_JAM
`condition node 640 are coupled to ATM Service
`DEGRADED condition node 625 via COMPOSES r

This document is available on Docket Alarm but you must sign up to view it.


Or .

Accessing this document will incur an additional charge of $.

After purchase, you can access this document again without charge.

Accept $ Charge
throbber

Still Working On It

This document is taking longer than usual to download. This can happen if we need to contact the court directly to obtain the document and their servers are running slowly.

Give it another minute or two to complete, and then try the refresh button.

throbber

A few More Minutes ... Still Working

It can take up to 5 minutes for us to download a document if the court servers are running slowly.

Thank you for your continued patience.

This document could not be displayed.

We could not find this document within its docket. Please go back to the docket page and check the link. If that does not work, go back to the docket and refresh it to pull the newest information.

Your account does not support viewing this document.

You need a Paid Account to view this document. Click here to change your account type.

Your account does not support viewing this document.

Set your membership status to view this document.

With a Docket Alarm membership, you'll get a whole lot more, including:

  • Up-to-date information for this case.
  • Email alerts whenever there is an update.
  • Full text search for other cases.
  • Get email alerts whenever a new case matches your search.

Become a Member

One Moment Please

The filing “” is large (MB) and is being downloaded.

Please refresh this page in a few minutes to see if the filing has been downloaded. The filing will also be emailed to you when the download completes.

Your document is on its way!

If you do not receive the document in five minutes, contact support at support@docketalarm.com.

Sealed Document

We are unable to display this document, it may be under a court ordered seal.

If you have proper credentials to access the file, you may proceed directly to the court's system using your government issued username and password.


Access Government Site

We are redirecting you
to a mobile optimized page.





Document Unreadable or Corrupt

Refresh this Document
Go to the Docket

We are unable to display this document.

Refresh this Document
Go to the Docket