US006691067B1

(12) United States Patent
     Ding et al.
(10) Patent No.: US 6,691,067 B1
(45) Date of Patent: Feb. 10, 2004

(54) ENTERPRISE MANAGEMENT SYSTEM AND METHOD WHICH INCLUDES STATISTICAL RECREATION OF SYSTEM RESOURCE USAGE FOR MORE ACCURATE MONITORING, PREDICTION, AND PERFORMANCE WORKLOAD CHARACTERIZATION

(75) Inventors: Yiping Ding, Dover, MA (US); Newman, Cambridge, MA (US)

(73) Assignee: BMC Software, Inc., Houston, TX (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days.

(21) Appl. No.: 09/287,601

(22) Filed: Apr. 7, 1999

(51) Int. Cl.7 .......................... G06F 19/00
(52) U.S. Cl. .......................... 702/186; 709/224; 709/226
(58) Field of Search .................. 702/186; 714/1; 709/224, 226

(56) References Cited

U.S. PATENT DOCUMENTS

5,655,081 A  *  8/1997  Bonnell et al. ........ 709/202
5,696,701 A  * 12/1997  Burgess et al. ........ 714/25
5,761,091 A  *  6/1998  Agrawal et al. ........ 702/186
5,796,633 A  *  8/1998  Burgess et al. ........ 702/187
5,920,719 A  *  7/1999  Sutton et al. ......... 717/130
6,269,401 B1 *  7/2001  Fletcher et al. ....... 709/224
6,289,379 B1 *  9/2001  Urano et al. .......... 709/223
6,513,065 B1 *  1/2003  Hafez et al. .......... 709/224
6,560,647 B1 *  5/2003  Hafez et al. .......... 709/224
6,564,174 B1 *  5/2003  Ding et al. ........... 702/186

* cited by examiner

Primary Examiner—Patrick Assouad
(74) Attorney, Agent, or Firm—Wong, Cabello, Lutsch, Rutherford & Brucculeri, LLP

(57) ABSTRACT

A system and method for estimating statistics concerning system metrics to provide for the accurate and efficient monitoring of one or more computer systems. The system preferably comprises a distributed computing environment, i.e., an enterprise, which comprises a plurality of interconnected computer systems. At least one of the computer systems is an agent computer system which includes agent software and/or system software for the collection of data relating to one or more metrics, i.e., measurements of system resources. Metric data is continually collected over the course of a measurement interval, regularly placed into a registry of metrics, and then periodically sampled from the registry indirectly. Sampling-related uncertainty and inaccuracy arise from two primary sources: the unsampled residual segments of seen (i.e., sampled and therefore known) events, and unseen (i.e., unsampled and therefore unknown) events. The total unsampled utilization and the total unseen utilization are accurately estimated according to the properties of one or more process service time distributions. The total unseen utilization is also estimated with an iterative method using gradations of the sample interval. The length distribution of the unseen processes is determined with the same iterative method.

75 Claims, 18 Drawing Sheets

Google Exhibit 1044
Google v. Valtrus

[Representative drawing, FIG. 8: flowchart of metric data collection and sampling: collect raw performance data at a high frequency (700); store and/or update raw data in the registry of metrics (702); when the sample interval A has expired (704), sample the registry of metrics (706); repeat until the measurement interval L has expired (708).]

[Sheet 1 of 18, FIG. 1: network diagram of an enterprise computing environment, showing a wide area network 102 and a local area network 104 interconnecting computer systems.]

[Sheet 2 of 18, FIG. 2: illustration of a typical computer system 150 with computer programs 160.]

[Sheet 3 of 18, FIG. 3: overview of the enterprise management system 180, showing a console node 400 (Monitor 402, Collect GUI 404, Analyze 406, Predict 408, Visualizer 410) and an agent node 300 (Agent 302, data collectors 304).]

[Sheet 4 of 18, FIG. 4: overview of the Monitor component 402, including Monitor Consoles 420a and 420b, a Manager Daemon 430, and a Policy Registration Queue 440.]

[Sheet 5 of 18, FIG. 5: overview of the Agent component of the enterprise management system, including the Agent 302, data collectors, input queues, and history repositories.]

[Sheet 6 of 18, FIG. 6: overview of the Analyze component 406, including the Analyze GUI 460, reports, model files (.mdl), and Visualizer files (.vis).]

[Sheet 7 of 18, FIG. 7: overview of the Predict component 408, including model files (.mdl), Visualizer files (.vis), hardware files (.hw), and user-specified configuration changes.]

[Sheet 8 of 18, FIG. 8: flowchart of metric data collection and sampling: collect raw performance data at a high frequency (700); store and/or update raw data in the registry of metrics (702); when the sample interval A has expired (704), sample the registry of metrics (706); repeat until the measurement interval L has expired (708).]

[Sheet 9 of 18, FIGS. 9 and 10: time-line diagrams of, respectively, the unsampled residual segment of a seen event and an unseen event that falls entirely between samples.]

[Sheet 10 of 18, FIGS. 11 and 12: flowcharts of the estimation of metric data statistics. FIG. 11: determine the measurement interval L (720), determine a total uncaptured utilization U_r, and determine a total unseen utilization U_n. FIG. 12: determine one or more (quantity d) process service time distributions (740); determine a quantity n_i of seen processes which follow each distribution i (742); determine a mean residual time r_i for each distribution i (744).]

[Sheet 11 of 18, FIG. 13: flowchart further illustrating the determination of the total uncaptured utilization: determine the process service time distribution (750); determine the quantity n_i of seen processes which follow this distribution (752); determine G_t(r) = P(R <= r | X > t) = P(t < X <= t + r) / P(X > t) (754); determine the mean residual time as the integral of r dG_t(r).]

[Sheet 12 of 18, FIGS. 14 and 15: flowcharts for two process service time distributions. FIG. 14: determine the process service time distribution to be an exponential distribution with service rate mu (760); determine the quantity n_i of seen processes which follow this exponential distribution (762); determine the mean residual time (764). FIG. 15: determine the process service time distribution to be a uniform distribution between zero and C (780); determine the quantity n_i of seen processes which follow this uniform distribution (782); determine the mean residual time as min(C - t, A) / 2 (784).]

[Sheet 13 of 18, FIGS. 16 and 17: flowcharts for an unknown process service time distribution. In each, the distribution is determined to be an unknown distribution (800, 820), the quantity n_i of seen processes which follow the unknown distribution is determined (802, 822), and the mean residual time is estimated from the sampled data for the seen processes.]

[Sheet 14 of 18, FIG. 18: flowchart of the determination of the total unseen utilization: determine a total captured utilization U_c as the sum of all sampled lengths of all seen processes over the measurement interval L (840); determine a total measured utilization U_m (842); determine U_n = U_m - U_c - U_r (844).]

[Sheet 15 of 18, FIG. 19: a matrix of m x n buckets, indexed by gradations of the sample interval, used in the estimation of the total unseen utilization.]

[Sheet 16 of 18, FIG. 20: a specific example of the estimation of the total unseen utilization with buckets.]

[Sheet 17 of 18, FIG. 21: flowchart of the iterative bucket method of estimating the total unseen utilization: create m x n buckets (860); place each seen process into the appropriate bucket (862); start with the bucket with the longest seen process(es) (864); count the processes in the current bucket (866); subtract the fraction of longer processes that landed in this bucket (868); multiply by m, the number of buckets per sample interval A (870); estimate the number of unseen processes (872); if this is not the lowest-ranked bucket (874), descend to the bucket at the next lower rank (876) and repeat.]

[Sheet 18 of 18, FIGS. 22 and 23: equations used to generate a length distribution of the unseen processes from the total unseen utilization U_n.]

ENTERPRISE MANAGEMENT SYSTEM AND METHOD WHICH INCLUDES STATISTICAL RECREATION OF SYSTEM RESOURCE USAGE FOR MORE ACCURATE MONITORING, PREDICTION, AND PERFORMANCE WORKLOAD CHARACTERIZATION

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to the collection, analysis, and management of system resource data in distributed or enterprise computer systems, and particularly to the more accurate monitoring of the state of a computer system and more accurate prediction of system performance.

2. Description of the Related Art

The data processing resources of business organizations are increasingly taking the form of a distributed computing environment in which data and processing are dispersed over a network comprising many interconnected, heterogeneous, geographically remote computers. Such a computing environment is commonly referred to as an enterprise computing environment, or simply an enterprise. Managers of the enterprise often employ software packages known as enterprise management systems to monitor, analyze, and manage the resources of the enterprise. Enterprise management systems may provide for the collection of measurements, or metrics, concerning the resources of individual systems. For example, an enterprise management system might include a software agent on an individual computer system for the monitoring of particular resources such as CPU usage or disk access. U.S. Pat. No. 5,655,081 discloses one example of an enterprise management system.

In a sophisticated enterprise management system, tools for the analysis, modeling, planning, and prediction of system resource utilization are useful for assuring the satisfactory performance of one or more computer systems in the enterprise. Examples of such analysis and modeling tools are the "ANALYZE" and "PREDICT" components of "BEST/1 FOR DISTRIBUTED SYSTEMS" available from BMC Software, Inc. Such tools usually require the input of periodic measurements of the usage of resources such as central processing units (CPUs), memory, hard disks, network bandwidth, and the like. To ensure accurate analysis and modeling, therefore, the collection of accurate performance data is critical.

Many modern operating systems, including "WINDOWS NT" and UNIX, are capable of recording and maintaining an enormous amount of performance data and other data concerning the state of the hardware and software of a computer system. Such data collection is a key step for any system performance analysis and prediction. The operating system or system software collects raw performance data, usually at a high frequency, stores the data in a registry of metrics, and then periodically updates the data. In most cases, metric data is not used directly, but is instead sampled from the registry. Sampling at a high frequency, however, can consume substantial system resources such as CPU cycles, storage space, and I/O bandwidth. Therefore, it is impractical to sample the data at a high frequency. On the other hand, infrequent sampling cannot capture the complete system state: for example, significant short-lived events and/or processes can be missed altogether. Infrequent sampling may therefore distort a model of a system's performance. The degree to which the sampled data reliably reflects the raw data determines the usefulness of the performance model for system capacity planning. The degree of reliability also determines the usefulness of the performance statistics presented to end-users by performance tools.

Sensitivity to sampling frequency varies among data types. Performance data can be classified into three categories: cumulative, transient, and constant. Cumulative data is data that accumulates over time. For example, a system CPU time counter may collect the total number of seconds that a processor has spent in system state since system boot. With transient data, old data is replaced by new data. For example, the amount of free memory is a transient metric which is updated periodically to reflect the amount of memory not in use. However, values such as the mean, variance, and standard deviation can be computed based on a sampling history of the transient metric. The third type of performance data, constant data, does not change over the measurement interval or lifetime of the event. For example, system configuration information, process ID, and process start time are generally constant values.

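As a purely illustrative sketch of this classification (not taken from the patent; the metric names and values are hypothetical), a Python fragment might treat the three categories as follows: difference a cumulative counter between samples, summarize a transient metric over its sampling history, and record a constant value once.

from statistics import mean, pvariance

# Hypothetical sampled values; in practice these would be read from the
# registry of metrics once per sample interval.
cpu_seconds_total = [1500.0, 1507.5, 1512.0, 1520.25]  # cumulative: only accumulates
free_memory_mb = [512, 498, 530, 467]                   # transient: replaced on each update
process_start_time = "1999-04-07T09:00:00"              # constant over the process lifetime

# Cumulative data: the useful quantity is the delta between successive samples.
cpu_busy_per_interval = [b - a for a, b in zip(cpu_seconds_total, cpu_seconds_total[1:])]

# Transient data: compute statistics (mean, variance) over the sampling history.
free_memory_mean = mean(free_memory_mb)
free_memory_variance = pvariance(free_memory_mb)

# Constant data: no per-sample statistics are needed; record the value once.
print(cpu_busy_per_interval, free_memory_mean, free_memory_variance, process_start_time)
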
Of the three data types, transient performance metrics are the most sensitive to variations in the sample interval and are therefore the most likely to be characterized by uncertainty. For example, with infrequent sampling, some state changes may be missed completely. However, cumulative data may also be rendered uncertain by infrequent sampling, especially with regard to the variance of such a metric. Clearly, then, uncertainty of data caused by infrequent sampling can cause serious problems in performance modeling. Therefore, the goal is to use sampling to capture the essence of the system state with a sufficient degree of certainty. Nevertheless, frequent sampling is usually not a viable option because of the heavy resource usage involved.

For the foregoing reasons, there is a need for data collection and analysis tools and methods that accurately and efficiently reflect system resource usage at a lower sampling frequency.

SUMMARY OF THE INVENTION

The present invention is directed to a system and method that meet the needs for more accurate and efficient monitoring and prediction of computer system performance. In the preferred embodiment, the system and method are used in a distributed computing environment, i.e., an enterprise. The enterprise comprises a plurality of computer systems, or nodes, which are interconnected through a network. At least one of the computer systems is a monitor computer system from which a user may monitor the nodes of the enterprise. At least one of the computer systems is an agent computer system. An agent computer system includes agent software and/or system software that permits the collection of data relating to one or more metrics, i.e., measurements of system resources on the agent computer system. In the preferred embodiment, metric data is continually collected at a high frequency over the course of a measurement interval and placed into a registry of metrics. The metric data is not used directly but rather is routinely sampled at a constant sample interval from the registry of metrics. Because sampling uses substantial system resources, sampling is preferably performed at a lesser frequency than the frequency of collection.

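The collection and sampling scheme summarized above (and shown in FIG. 8) can be pictured with a short sketch. This is only a schematic reading of the flowchart, not the patented implementation; the interval values, the dictionary-based registry, and the collector callback are illustrative assumptions.

import time

def run_measurement(collect_raw, registry, sample_interval, measurement_interval):
    """Collect raw metric data at a high frequency, but sample the registry
    of metrics only once per sample interval (cf. FIG. 8, steps 700-708)."""
    samples = []
    start = last_sample = time.monotonic()
    while time.monotonic() - start < measurement_interval:        # step 708
        registry.update(collect_raw())                            # steps 700-702
        if time.monotonic() - last_sample >= sample_interval:     # step 704
            samples.append(dict(registry))                        # step 706
            last_sample = time.monotonic()
        time.sleep(0.01)  # high-frequency collection loop
    return samples

# Hypothetical usage: sample a trivial collector every 5 s for one minute.
# samples = run_measurement(lambda: {"cpu_busy": time.process_time()}, {},
#                           sample_interval=5.0, measurement_interval=60.0)
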
Sampled metric data can be used to build performance models for analysis and capacity planning. However, less frequent sampling can result in inaccurate models and data uncertainty, especially regarding the duration of events or processes and the number of events or processes.

The present invention is directed to reducing said uncertainty. Uncertainty arises from two primary sources: the unsampled segment of a seen process or event, and the unseen process or event. A seen process is a process that is sampled at least once; therefore, its existence and starting time are known. However, the residual time or utilization between the last sampling of the process or event and the death of the process or the termination of the event is unsampled and unknown. An unseen process is shorter than the sample interval and is not sampled at all, and therefore its entire utilization is unknown. Nevertheless, the total unsampled (i.e., residual) utilization and the total unseen utilization can be estimated with the system and method of the present invention.

In determining the total unsampled utilization, a quantity of process service time distributions are determined, and each of the seen processes is assigned a respective process service time distribution. For each distribution, a mean residual time is calculated using equations provided by the system and method. The total unsampled utilization is the sum of the mean residual time multiplied by the number of seen processes for each distribution, all divided by the measurement interval.

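A minimal numerical sketch of this computation follows, in Python with illustrative values. The mean residual time for an exponential service time distribution truncated at the sample interval is taken here to be 1/mu - A/(exp(mu*A) - 1), consistent with the exponential case of FIG. 14; that closed form, the variable names, and the numbers are assumptions for illustration, not a quotation of the patent's equations.

import math

def mean_residual_exponential(mu, sample_interval):
    # Assumed closed form for the mean residual time of an exponential
    # service time distribution truncated at the sample interval (cf. FIG. 14).
    return 1.0 / mu - sample_interval / (math.exp(mu * sample_interval) - 1.0)

def total_unsampled_utilization(groups, measurement_interval):
    """groups: list of (n_i, r_bar_i) pairs, one per process service time
    distribution, where n_i is the number of seen processes following
    distribution i and r_bar_i is its mean residual time.
    Returns U_r = sum(n_i * r_bar_i) / L."""
    return sum(n_i * r_bar_i for n_i, r_bar_i in groups) / measurement_interval

# Illustrative numbers: 40 seen processes with exponential service times
# (rate 0.5 per second), a 10-second sample interval, and a 600-second
# measurement interval.
r_bar = mean_residual_exponential(mu=0.5, sample_interval=10.0)
U_r = total_unsampled_utilization([(40, r_bar)], measurement_interval=600.0)
print(round(U_r, 4))
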
In determining the total unseen utilization, first the total captured utilization is determined to be the sum of the sampled utilizations of all seen processes over the measurement interval. Next, the total measured utilization, or the "actual" utilization over the measurement interval, is obtained from the system software or monitoring software. The difference between the total measured utilization and the total captured utilization is the uncertainty. Because the uncertainty is due to either unsampled segments or unseen events, the total unseen utilization is calculated to be the uncertainty (the total measured utilization minus the total captured utilization) minus the total unsampled utilization.

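Expressed as a one-line computation, with purely illustrative numbers:

def total_unseen_utilization(U_m, U_c, U_r):
    # Unseen utilization = total measured utilization minus total captured
    # utilization minus total unsampled (residual) utilization (cf. FIG. 18).
    return U_m - U_c - U_r

# Example: 62% measured, 45% captured from samples, 13% estimated residual.
print(round(total_unseen_utilization(0.62, 0.45, 0.13), 2))  # -> 0.04 unseen
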
When the total measured utilization is not available, the total unseen utilization is estimated with an iterative bucket method. A matrix of buckets is created, wherein each row corresponds to the sample interval and each bucket to a gradation of the sample interval. Each process is placed into the appropriate bucket according to how many times it was sampled and when in the sample interval it began. Starting with the bucket with the longest process(es) and working iteratively back through the other buckets, the number of unseen processes is estimated for each length gradation of the sample interval. The iterative bucket method is also used to determine a length distribution of unseen processes.

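The iteration can be pictured with the following structural sketch, which follows the steps of FIG. 21 at a coarse level only. The exact correction terms and the length-distribution equations are those of the patent's detailed description and FIGS. 22 and 23; the simple fractions, function name, and example counts used here are assumptions for illustration.

def estimate_unseen_counts(seen_counts, m):
    """Structural sketch of the iterative bucket method (cf. FIG. 21).

    seen_counts[k] is the number of seen processes whose sampled length fell
    into length gradation k of the sample interval (k = 0 shortest,
    k = len(seen_counts) - 1 longest); m is the number of buckets per sample
    interval. Returns an estimated count of unseen processes per gradation."""
    num_gradations = len(seen_counts)
    totals = [0.0] * num_gradations
    # Start with the bucket holding the longest seen processes and work
    # down through the ranks (steps 864, 874, 876).
    for k in reversed(range(num_gradations)):
        count = float(seen_counts[k])                    # step 866
        # Subtract the (assumed) fraction of longer processes that, because
        # of where they started, landed in this bucket (step 868).
        for j in range(k + 1, num_gradations):
            count -= totals[j] / m
        # Multiply by m, the number of buckets per sample interval (step 870),
        # to estimate the total number of processes of this length.
        totals[k] = max(count, 0.0) * m
    # Unseen processes per gradation = estimated total minus those seen (step 872).
    return [max(t - s, 0.0) for t, s in zip(totals, seen_counts)]

# Illustrative use: observed per-gradation counts with m = 4 gradations.
print(estimate_unseen_counts([12, 7, 3, 1], m=4))
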
In response to the determination of utilizations described above, the system and method are able to use this information in modeling and/or analyzing the enterprise. In various embodiments, the modeling and/or analyzing may further comprise one or more of the following: displaying the determinations to a user, predicting future performance, graphing a performance prediction, generating reports, asking a user for further data, permitting a user to modify a model of the enterprise, and altering a configuration of the enterprise in response to the determinations.

BRIEF DESCRIPTION OF THE DRAWINGS

A better understanding of the present invention can be obtained when the following detailed description of the preferred embodiment is considered in conjunction with the following drawings, in which:

FIG. 1 is a network diagram of an illustrative enterprise computing environment;

FIG. 2 is an illustration of a typical computer system with computer software programs;

FIG. 3 is a block diagram illustrating an overview of the enterprise management system according to the preferred embodiment of the present invention;

FIG. 4 is a block diagram illustrating an overview of the Monitor component of the enterprise management system according to the preferred embodiment of the present invention;

FIG. 5 is a block diagram illustrating an overview of the Agent component of the enterprise management system according to the preferred embodiment of the present invention;

FIG. 6 is a block diagram illustrating an overview of the Analyze component of the enterprise management system according to the preferred embodiment of the present invention;

FIG. 7 is a block diagram illustrating an overview of the Predict component of the enterprise management system according to the preferred embodiment of the present invention;

FIG. 8 is a flowchart illustrating an overview of the collection and sampling of metric data;

FIG. 9 is a diagram illustrating an unsampled segment of a seen event;

FIG. 10 is a diagram illustrating an unseen event;

FIG. 11 is a flowchart illustrating an overview of the estimation of metric data statistics;

FIG. 12 is a flowchart illustrating the determination of the total uncaptured utilization;

FIG. 13 is a flowchart further illustrating the determination of the total uncaptured utilization;

FIG. 14 is a flowchart illustrating the determination of the portion of the total uncaptured utilization for an exponential distribution;

FIG. 15 is a flowchart illustrating the determination of the portion of the total uncaptured utilization for a uniform distribution;

FIG. 16 is a flowchart illustrating the determination of the portion of the total uncaptured utilization for an unknown distribution;

FIG. 17 is a flowchart illustrating an alternative method of the determination of the portion of the total uncaptured utilization for an unknown distribution;

FIG. 18 is a flowchart illustrating the determination of the total unseen utilization;

FIG. 19 illustrates a matrix of buckets used in the estimation of the total unseen utilization;

FIG. 20 illustrates a specific example of the estimation of the total unseen utilization with buckets;

FIG. 21 is a flowchart illustrating the iterative bucket method of estimating the total unseen utilization;

FIGS. 22 and 23 are equations which are used to generate a length distribution of the unseen processes.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

U.S. Pat. No. 5,655,081 titled "System for Monitoring and Managing Computer Resources and Applications Across a Distributed Environment Using an Intelligent Autonomous Agent Architecture" is hereby incorporated by reference as though fully and completely set forth herein.

U.S. Pat. No. 5,761,091 titled "Method and System for Reducing the Errors in the Measurements of Resource Usage in Computer System Processes and Analyzing Process Data with Subsystem Data" is hereby incorporated by reference as though fully and completely set forth herein.

FIG. 1 illustrates an enterprise computing environment according to one embodiment of the present invention. An enterprise 100 comprises a plurality of computer systems which are interconnected through one or more networks. Although one particular embodiment is shown in FIG. 1, the enterprise 100 may comprise a variety of heterogeneous computer systems and networks which are interconnected in a variety of ways and which run a variety of software applications.

One or more local area networks (LANs) 104 may be included in the enterprise 100. A LAN 104 is a network that spans a relatively small area. Typically, a LAN 104 is confined to a single building or group of buildings. Each node (i.e., individual computer system or device) on a LAN 104 preferably has its own CPU with which it executes programs, and each node is also able to access data and devices anywhere on the LAN 104. The LAN 104 thus allows many users to share devices (e.g., printers) as well as data stored on file servers. The LAN 104 may be characterized by any of a variety of types of topology (i.e., the geometric arrangement of devices on the network), of protocols (i.e., the rules and encoding specifications for sending data, and whether the network uses a peer-to-peer or client/server architecture), and of media (e.g., twisted-pair wire, coaxial cables, fiber optic cables, radio waves). As illustrated in FIG. 1, the enterprise 100 includes one LAN 104. However, in alternate embodiments the enterprise 100 may include a plurality of LANs 104 which are coupled to one another through a wide area network (WAN) 102. A WAN 102 is a network that spans a relatively large geographical area.

Each LAN 104 comprises a plurality of interconnected computer systems and optionally one or more other devices: for example, one or more workstations 110a, one or more personal computers 112a, one or more laptop or notebook computer systems 114, one or more server computer systems 116, and one or more network printers 118. As illustrated in FIG. 1, the LAN 104 comprises one of each of computer systems 110a, 112a, 114, and 116, and one printer 118. The LAN 104 may be coupled to other computer systems and/or other devices and/or other LANs 104 through a WAN 102.

One or more mainframe computer systems 120 may optionally be coupled to the enterprise 100. As shown in FIG. 1, the mainframe 120 is coupled to the enterprise 100 through the WAN 102, but alternatively one or more mainframes 120 may be coupled to the enterprise 100 through one or more LANs 104. As shown, the mainframe 120 is coupled to a storage device or file server 124 and mainframe terminals 122a, 122b, and 122c. The mainframe terminals 122a, 122b, and 122c access data stored in the storage device or file server 124 coupled to or comprised in the mainframe computer system 120.

The enterprise 100 may also comprise one or more computer systems which are connected to the enterprise 100 through the WAN 102: as illustrated, a workstation 110b and a personal computer 112b. In other words, the enterprise 100 may optionally include one or more computer systems which are not coupled to the enterprise 100 through a LAN 104. For example, the enterprise 100 may include computer systems which are geographically remote and connected to the enterprise 100 through the Internet.

The present invention preferably comprises computer programs 160 stored on or accessible to each computer system in the enterprise 100. FIG. 2 illustrates computer programs 160 and a typical computer system 150. Each computer system 150 typically comprises components such as a CPU 152, with an associated memory media. The memory media stores program instructions of the computer programs 160, wherein the program instructions are executable by the CPU 152. The memory media preferably comprises a system memory such as RAM and/or a nonvolatile memory such as a hard disk. The computer system 150 further comprises a display device such as a monitor 154, an alphanumeric input device such as a keyboard 156, and optionally a directional input device such as a mouse 158. The computer system 150 is operable to execute computer programs 160.

When the computer programs are executed on one or more computer systems 150, an enterprise management system 180 is operable to monitor, analyze, and manage the computer programs, processes, and resources of the enterprise 100. Each computer system 150 in the enterprise 100 executes or runs a plurality of software applications or processes. Each software application or process consumes a portion of the resources of a computer system and/or network: for example, CPU time, system memory such as RAM, nonvolatile memory such as a hard disk, network bandwidth, and input/output (I/O). The enterprise management system 180 permits users to monitor, analyze, and manage resource usage on heterogeneous computer systems 150 across the enterprise 100.

FIG. 3 shows an overview of the enterprise management system 180. The enterprise management system 180 includes at least one console node 400 and at least one agent node 300, but it may include a plurality of console nodes 400 and/or a plurality of agent nodes 300. In general, an agent node 300 executes software to collect metric data on its computer system 150, and a console node 400 executes software to monitor, analyze, and manage the collected metrics from one or more agent nodes 300. A metric is a measurement of a particular system resource. For example, in the preferred embodiment, the enterprise management system 180 collects metrics such as CPU, disk I/O, file system usage, database usage, threads, processes, kernel, registry, logical volumes, and paging. Each computer system 150 in the enterprise 100 may comprise a console node 400, an agent node 300, or both a console node 400 and an agent node 300. In the preferred embodiment, server computer systems include agent nodes 300, and other computer systems may also comprise agent nodes 300 as desired, e.g., file servers, print servers, e-mail servers, and internet servers. The console node 400 and agent node 300 are characterized by an end-by-end relationship: a single console node 400 may be linked to a single agent node 300, or a single console node 400 may be linked to a plurality of agent nodes 300, or a plurality of console nodes 400 may be linked to a single agent node 300, or a plurality of console nodes 400 may be linked to a plurality of agent nodes 300.

In the preferred embodiment, the console node 400 comprises four user-visible components: a Monitor component 402, a Collect graphical user interface (GUI) 404, an Analyze component 406, and a Predict component 408. In one embodiment, all four components 402, 404, 406, and 408 of the console node 400 are part of the "BEST/1 FOR DISTRIBUTED SYSTEMS" software package or the "PATROL" software package, all available from BMC Software, Inc. The agent node 300 comprises an Agent 302, one or more data collectors 304, Universal Data Repository (UDR) history files 210a, and Universal Data Format (UDF) history files 212a. In alternate embodiments, the agent node 300 includes either of UDR 210a or UDF 212a, but not both.

The Monitor component 402 allows a user to monitor, in real time, data that is being collected by an Agent 302 and being sent to the Monitor 402. The Collect GUI 404 is employed to schedule data collection on an agent node 300. The Analyze component 406 takes historical data from a UDR 210a and/or UDF 212a to create a model of the enterprise 100. The Predict component 408 takes the model from the Analyze component 406 and allows a user to alter the model by specifying hypothetical changes to the enterprise 100. Analyze 406 and Predict 408 can create output in a format which can be understood and displayed by a Visualizer tool 410. In the preferred embodiment, Visualizer 410 is the "BEST/1-VISUALIZER" available from BMC Software, Inc. In one embodiment, Visualizer 410 is also part of the console node 400.

The Agent 302 controls data collection on a particular computer system and reports the data in real time to one or more Monitors 402. In the preferred embodiment, the Agent 302 is part of the "BEST/1 FOR DISTRIBUTED SYSTEMS" software package available from BMC Software, Inc. The data collectors 304 collect data from various processes and subsystems of the agent node 300. The Agent 302 sends real-time data to the UDR 210a, which is a database of historical data in a particular data format. The UDF 212a is similar to the UDR 210a, but the UDF 212a uses an alternative data format and is written directly by the data collectors 304.

FIG. 4 shows an overview of the Monitor component 402 of the console node 400 of the enterprise management system 180. The Monitor 402 comprises a Manager Daemon 430, one or more Monitor Consoles (as illustrated, 420a and 420b), and a Policy Registration Queue 440. Although two Monitor Consoles 420a and 420b are shown in FIG. 4, the present invention contemplates that one or more Monitor Consoles may be executing on any of one or more console nodes 400.

In the preferred embodiment, the Monitor Consoles 420a and 420b use a graphical user interface (GUI) for user input and information display. Preferably, the Monitor Consoles 420a and 420b are capable of sending several different types of requests to an Agent 302, including: alert requests, update requests, graph requests, and drilldown requests. An alert request specifies one or more thresholds to be checked on a routine basis by the Agent 302 to detect a problem on the agent node 300. For example, an alert request might ask the Agen

