Agrawal et al.

[11] Patent Number: 5,761,091
[45] Date of Patent: Jun. 2, 1998

[54] METHOD AND SYSTEM FOR REDUCING
     THE ERRORS IN THE MEASUREMENTS OF
     RESOURCE USAGE IN COMPUTER
     SYSTEM PROCESSES AND ANALYZING
     PROCESS DATA WITH SUBSYSTEM DATA

[75] Inventors: Subhash C. Agrawal, Lincoln;
     Kenneth Newman, Cambridge; Carol
     Rathrock, Lexington, all of Mass.

[73] Assignee: BGS Systems, Inc., Waltham, Mass.

[21] Appl. No.: 763,187
[22] Filed: Dec. 10, 1996

[51] Int. Cl.6 .......... G01B 17/00
[52] U.S. Cl. ........... 364/551.01; 364/550; 395/674
[58] Field of Search .... 364/550, 551.01,
     364/569; 395/183.14, 184.01, 180, 200.54,
     200.55, 200.56, 670, 672, 673, 674

[56] References Cited

U.S. PATENT DOCUMENTS

3,818,458   6/1974   Deese ............ 340/172.5
5,072,376  12/1991   Ellsworth ........ 395/650
5,355,501  10/1994   Gross et al. ..... 395/750
5,499,340   3/1996   Barritz .......... 395/184.01
5,560,011   9/1996   Uyama ............ 395/700
5,590,056  12/1996   Barritz .......... 364/550

Primary Examiner: James P. Trammell
Attorney, Agent, or Firm: Rines and Rines

[57] ABSTRACT

A novel method of and system and procedures for more
accurately measuring the resource usage of UNIX processes
by sampling methods involving appropriate corrections for
the resource usage of the terminated processes and analyzing
UNIX process data along with subsystem data such as
RDBMSs, allowing system administrators and managers to
get a much better picture of who is using the resources on the
system and thus perform a better job at performance analysis
and capacity planning, the technique also reducing the error
in the process measurements collected by sampling of the
resource usage measured by the operating system and
correlating the measurements taken by subsystems with the
measurements taken by the operating system.

20 Claims, 9 Drawing Sheets
[Front-page drawing: flowchart of the BASIC SYSTEM & PROCESS DATA ANALYSIS METHOD, reproduced as FIG. 6]
`
`
`
`
U.S. Patent    Jun. 2, 1998    Sheet 1 of 9    5,761,091

[FIG. 1 (PRIOR ART): queuing diagram]
ARRIVING CONSUMERS -> CONSUMERS USING OR WAITING FOR RESOURCES -> DEPARTING CONSUMERS

[FIG. 2: timeline of clock ticks; the tick labels did not survive extraction. Annotations, in order:]
CPU BUSY - WHOLE TICK CHARGED TO PROCESS 1
CPU BUSY - WHOLE TICK CHARGED TO PROCESS 2
CPU SEEN IDLE - CPU USED BY PROCESS 2 BETWEEN TICKS (b) & (c) IS NOT CHARGED TO ANY PROCESS
CPU SEEN IDLE - CPU USE BY PROCESS 3 LOST
CPU SEEN BUSY - WHOLE TICK IS CHARGED TO PROCESS 1
CPU IDLE - CPU USED BY PROCESS 1 BETWEEN (e) & (f) NOT SEEN
CPU IDLE - CPU USED BY NEW PROCESS 11, A CHILD OF PROCESS 1, IS NOT MEASURED AT ALL
CPU SEEN BUSY - WHOLE TICK CHARGED TO PROCESS 12, CHILD OF PROCESS 1
CPU SEEN BUSY - WHOLE TICK IS CHARGED TO PROCESS 12 IN USE
PROCESS 12 TERMINATES. ITS CPU IS CHARGED TO CHILDREN FIELD OF PROCESS 1.
`
`
U.S. Patent    Jun. 2, 1998    Sheet 2 of 9    5,761,091

[FIG. 3: tree diagram of processes]
A CIRCLE DENOTES A PROCESS. PROCESSES LOWER IN THE HIERARCHY
ARE OFFSPRING OF THE PROCESSES HIGHER IN THE HIERARCHY.
Pabcd DENOTES THE NAME OF THE PROCESS.

CONSUMER/PROCESS HIERARCHY

FIG. 3

• PROCESS IDENTIFIER
• PROCESS'S PARENT IDENTIFIER
• PROCESS NAME
• LONG OR FULL COMMAND NAME
• USER NAME
• ACCOUNT NAME
• CPU USED BY PROCESS
• CPU USED BY TERMINATED CHILDREN OF THE PROCESS
• START TIME OF THE PROCESS
• TERMINATION TIME OF THE PROCESS

SOME PROCESS ATTRIBUTES/METRICS

FIG. 4
`
`
`
U.S. Patent    Jun. 2, 1998    Sheet 3 of 9    5,761,091

[FIG. 5, printed sideways on the sheet: flowchart of the DATA COLLECTION METHOD. Recoverable box text:]

SYSTEM COUNTERS: SAMPLE SYSTEM COUNTERS PERIODICALLY; PERFORM
INTERMEDIARY COMPUTATIONS (E.G., AVERAGING WITH PREVIOUS SAMPLE);
RECORD PERIODICALLY.

PROCESS COUNTERS: EVERY (X) SECONDS SAMPLE PROCESS COUNTERS AND
COMPARE WITH PROCESSES FROM PREVIOUS SAMPLES; IDENTIFY PROCESSES
THAT DO NOT EXIST AND MARK THOSE AS DELETED; IDENTIFY DELETED
PROCESSES' PARENT OR FIRST EXISTING ANCESTOR PROCESS AND ALLOCATE
THE INCREASE IN ITS (TERMINATED) CHILDREN RESOURCE USAGE TO THE
TERMINATED CHILDREN PROCESSES; WRITE THE RECORDS OF TERMINATED
PROCESSES; EVERY 'Z' SECONDS WRITE THE RECORDS FOR ALL NOT YET
TERMINATED PROCESSES.

FIG. 5
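The correction for terminated processes outlined in FIG. 5 can be sketched in Python. This is only an illustrative sketch under stated assumptions, not the patented implementation: the snapshot layout (a dict per process with hypothetical keys `ppid`, `cpu` and `child_cpu`) and the function name `allocate_terminated_usage` are invented here, and the later step that undoes an adjustment made in a previous interval is omitted.

```python
def allocate_terminated_usage(prev, curr):
    """Return {pid: corrected cumulative CPU} for processes that vanished.

    prev, curr: snapshots {pid: {"ppid": int, "cpu": float, "child_cpu": float}}
    (hypothetical layout). "child_cpu" is the CPU the OS has deposited in the
    process's terminated-children counter.
    """
    terminated = [pid for pid in prev if pid not in curr]

    def surviving_ancestor(pid):
        # Walk up the tree until we reach a process that still exists.
        ppid = prev[pid]["ppid"]
        while ppid not in curr and ppid in prev:
            ppid = prev[ppid]["ppid"]
        return ppid if ppid in curr else None

    # Group terminated processes under their first existing ancestor.
    groups = {}
    for pid in terminated:
        ancestor = surviving_ancestor(pid)
        if ancestor is not None:
            groups.setdefault(ancestor, []).append(pid)

    # Start from the last values the processes themselves reported.
    corrected = {pid: prev[pid]["cpu"] for pid in terminated}
    for ancestor, pids in groups.items():
        before = prev[ancestor]["child_cpu"] if ancestor in prev else 0.0
        increase = curr[ancestor]["child_cpu"] - before
        known = sum(prev[p]["cpu"] + prev[p]["child_cpu"] for p in pids)
        extra = increase - known
        if extra > 0:
            # Spread the unexplained increase evenly over the terminated group.
            for p in pids:
                corrected[p] += extra / len(pids)
    return corrected


prev = {
    1: {"ppid": 0, "cpu": 5.0, "child_cpu": 0.0},
    10: {"ppid": 1, "cpu": 2.0, "child_cpu": 0.0},
    11: {"ppid": 1, "cpu": 1.0, "child_cpu": 0.0},
}
curr = {1: {"ppid": 0, "cpu": 5.5, "child_cpu": 4.0}}
print(allocate_terminated_usage(prev, curr))
```

In this example the parent's children counter grew by 4.0 while the two vanished children had last reported 3.0 in total, so the remaining 1.0 is split evenly between them.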
`
`
`
`
`
`
U.S. Patent    Jun. 2, 1998    Sheet 4 of 9    5,761,091

SELECT DATA RECORDS FOR THE GIVEN ANALYSIS INTERVAL

COMPUTE SYSTEM WIDE RESOURCE USE TOTALS (A)

COMPUTE TOTALS OF RESOURCES USED BY PROCESSES (B)

FOR EACH PROCESS, IDENTIFY WORKLOAD IT BELONGS
TO, COMPUTE TOTAL RESOURCE USE FOR EACH
WORKLOAD

COMPUTE RATIO OF TOTALS (B) AND (A):
B/A = CAPTURE RATIO

DIVIDE WORKLOAD AND PROCESS RESOURCE USE
BY CAPTURE RATIO*

*SIMILARLY ADJUST RESOURCE USE OF PROCESSES.

BASIC SYSTEM & PROCESS DATA ANALYSIS METHOD

FIG. 6
`
`
`
U.S. Patent    Jun. 2, 1998    Sheet 5 of 9    5,761,091

SELECT DATA RECORDS FOR THE GIVEN ANALYSIS INTERVAL

COMPUTE SYSTEM WIDE RESOURCE USE TOTALS (A)

COMPUTE TOTALS OF RESOURCES USED BY PROCESSES (B),
AND ESTIMATE OF PROCESSES LIFESPAN

COMPUTE THE DIFFERENCE BETWEEN (A) & (B): A-B

FOR EACH PROCESS, DETERMINE THE WORKLOAD IT BELONGS TO,
ACCUMULATE TOTAL RESOURCE USE FOR THE WORKLOAD AND
ALSO CREATE A HISTOGRAM OF LIFESPAN VS. RESOURCES USED
USING 3 BUCKETS, E.G., SHORT LIFE PROCESSES (< X/2 SECONDS
LONG), MEDIUM LIFE PROCESSES (X/2 TO 2X LONG),
LONG LIFE (LONGER THAN 2X LONG)

ALLOCATE A SUBSTANTIAL PORTION (C%) OF THE DIFFERENCE
(A-B) TO WORKLOADS IN PROPORTION TO THEIR RESOURCE
USAGE IN BUCKETS SHORT LIFE, MEDIUM LIFE AND LONG LIFE
WITH RELATIVE WEIGHTS (2,1,0)

ALLOCATE THE REMAINDER (100%-C%) OF THE DIFFERENCE TO
WORKLOADS IN PROPORTION TO THEIR MODIFIED TOTAL
RESOURCE USAGE

* SIMILARLY ADJUST THE RESOURCE USE OF PROCESSES

* ONE CAN ALTERNATELY ADJUST PROCESS RESOURCE USE
TIMES BEFORE ACCUMULATING TIMES FOR WORKLOADS

REFINED SYSTEM & PROCESS ANALYSIS METHOD

FIG. 7
`
`
`
U.S. Patent    Jun. 2, 1998    Sheet 6 of 9    5,761,091

[FIG. 8: diagram of the resource consuming entities in an Oracle instance. Recoverable labels:]
ORACLE PROCESSES: ORACLE SERVER PROCESSES, ORACLE DISPATCHER, ORACLE LOGWRITER
CPU USED FOR TRANSACTION FROM USER HARRY, RUNNING 'FIN'
APPLICATION AGAINST DATABASE DIV4
RESOURCE CONSUMING ENTITIES IN AN ORACLE INSTANCE.
(SESSION LEVEL DETAIL IGNORED HERE FOR SIMPLICITY)

PROCESSES & SUBSYSTEMS: AN ORACLE EXAMPLE

FIG. 8
`
`
`
U.S. Patent    Jun. 2, 1998    Sheet 7 of 9    5,761,091

[FIG. 9: two parallel flows]

SAMPLE SERVER'S GLOBAL METRICS EVERY 'R' SECONDS
ACCUMULATE METRICS AS APPROPRIATE
RECORD METRICS EVERY 'S' SECONDS

SAMPLE SUBSYSTEM PERFORMANCE COUNTERS USER, SESSION,
DATABASE TABLES EVERY 'K' SECONDS
ACCUMULATE RECORDS FOR SESSIONS THAT HAVE SAME KEYS
RECORD METRICS EVERY 'L' SECONDS (TYPICALLY L=S)

METHOD FOR COLLECTING DATA FOR
SUBSYSTEMS SUCH AS ORACLE

FIG. 9
`
`
`
U.S. Patent    Jun. 2, 1998    Sheet 8 of 9    5,761,091

FIG. 10

STEP 1: ADJUST CPU UTILIZATION FOR INSTANCE PROCESSES
• COMPUTE OVERALL CAPTURE RATIO CR
• SELECT PROCESS RECORDS ASSOCIATED WITH THE SUBSYSTEM INSTANCE
• SUM CPU UTILIZATION OF THESE PROCESS RECORDS
• A = INSTANCE PROCESS CPU UTILIZATION ADJUSTED BY THE CAPTURE RATIO

STEP 2: FIND SUBSYSTEM DATA
• SELECT APPROPRIATE SUBSYSTEM RECORDS FROM DATA FOR GIVEN INTERVAL

STEP 3: ADJUST SUBSYSTEM INSTANCE LEVEL UTILIZATION
• COMPUTE TOTAL CPU UTILIZATION 'B' FOR THE INSTANCE AS MEASURED
  BY THE SUBSYSTEM
• COMPUTE FIRST COMPONENT OF THE SUBSYSTEM OVERHEAD AS
  OVHD1 = MIN(OVHD1_CAP, A-B)
  WHERE THE OVHD1_CAP CAN BE A FUNCTION OF TOTAL CPU UTILIZATION
• COMPUTE ADJUSTED INSTANCE CPU UTILIZATION
  C = A-OVHD1

STEP 4: ADJUST SUBSYSTEM SESSION LEVEL UTILIZATION USING INSTANCE
LEVEL UTILIZATION
• COMPUTE SUM OF UTILIZATION FOR EACH SESSION
  D = Σ SESSION_CPU_UTILIZATION
• COMPUTE SECOND COMPONENT OF THE SUBSYSTEM OVERHEAD
  OVHD2 = MIN(OVHD2_CAP, C-D)
  WHERE OVHD2_CAP CAN BE A FUNCTION OF ADJUSTED INSTANCE
  CPU UTILIZATION
• ADJUST CPU UTILIZATION OF SESSIONS
  ADJUSTED_SESSION_CPU_UTILIZATION =
  (C-OVHD2)/C * SESSION_CPU_UTILIZATION

STEP 5: COMPUTE OVERALL OVERHEAD WORKLOAD
TOTAL OVERHEAD UTILIZATION = OVHD1+OVHD2

STEP 6: COMPUTE SUBSYSTEM WORKLOAD UTILIZATION
• GROUP SESSIONS INTO WORKLOADS ON THE BASIS OF THEIR DB NAME
  OR USER NAME OR APPLICATION NAME
• COMPUTE CPU UTILIZATION FOR EACH WORKLOAD BY SUMMING ADJUSTED
  SESSION CPU UTILIZATION

METHOD FOR COMPUTING COMPONENT RESOURCES FOR SUBSYSTEM SUCH AS ORACLE
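The six steps of FIG. 10 can be followed literally in a short Python sketch. The function name `subsystem_workload_cpu`, the session record layout (dicts with hypothetical keys `cpu` and `db`) and the use of plain numbers for the two overhead caps are assumptions made for illustration; the figure only says that each cap "can be a function" of a utilization figure. "Adjusted by the capture ratio" is read here as division by CR, as in FIG. 6.

```python
def subsystem_workload_cpu(process_cpu, capture_ratio, instance_cpu,
                           sessions, ovhd1_cap, ovhd2_cap, key="db"):
    """Follow FIG. 10: return (total overhead, {workload: adjusted CPU})."""
    # Step 1: instance process CPU as seen by the OS, adjusted by the capture ratio.
    a = sum(process_cpu) / capture_ratio
    # Step 3: instance total as measured by the subsystem, first overhead part.
    b = instance_cpu
    ovhd1 = min(ovhd1_cap, a - b)
    c = a - ovhd1
    # Step 4: session totals, second overhead part, session scaling factor.
    d = sum(s["cpu"] for s in sessions)
    ovhd2 = min(ovhd2_cap, c - d)
    scale = (c - ovhd2) / c
    # Step 6: group adjusted session CPU into workloads by the chosen key.
    workloads = {}
    for s in sessions:
        workloads[s[key]] = workloads.get(s[key], 0.0) + scale * s["cpu"]
    # Step 5: overall overhead workload.
    return ovhd1 + ovhd2, workloads


sessions = [
    {"db": "fin", "cpu": 20.0},
    {"db": "hr", "cpu": 10.0},
    {"db": "fin", "cpu": 5.0},
]
overhead, workloads = subsystem_workload_cpu(
    [20.0, 12.0], 0.8, 36.0, sessions, ovhd1_cap=3.0, ovhd2_cap=5.0)
print(overhead, workloads)
```

The sketch does not clamp negative differences (A-B or C-D below zero); how such cases should be treated is not stated in the figure.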
`
`
`
U.S. Patent    Jun. 2, 1998    Sheet 9 of 9    5,761,091

FIG. 11

STEP 1: ADJUST CPU UTILIZATION FOR INSTANCE PROCESSES
• COMPUTE OVERALL CAPTURE RATIO CR
• SELECT PROCESS RECORDS ASSOCIATED WITH THE SUBSYSTEM INSTANCE
• SUM CPU UTILIZATION OF THESE PROCESS RECORDS
• A = INSTANCE PROCESS CPU UTILIZATION ADJUSTED BY THE CAPTURE RATIO

STEP 2: FIND SUBSYSTEM DATA
• SELECT APPROPRIATE SUBSYSTEM RECORDS FROM DATA FOR GIVEN INTERVAL

STEP 3: ADJUST SUBSYSTEM INSTANCE LEVEL UTILIZATION
• COMPUTE TOTAL CPU UTILIZATION 'B' FOR THE INSTANCE AS MEASURED
  BY THE SUBSYSTEM
• COMPUTE FIRST COMPONENT OF THE SUBSYSTEM OVERHEAD AS
  OVHD1 = MIN(OVHD1_CAP, A-B)
  WHERE THE OVHD1_CAP CAN BE A FUNCTION OF TOTAL CPU UTILIZATION
• COMPUTE ADJUSTED INSTANCE CPU UTILIZATION
  C = A-OVHD1

STEP 4: PREADJUST SUBSYSTEM SESSION LEVEL UTILIZATION USING USER
LEVEL UTILIZATION
FOR EACH USER IN THE SUBSYSTEM
• COMPUTE TOTAL CPU UTILIZATION 'E' AS MEASURED BY THE SUBSYSTEM
• COMPUTE SUM OF UTILIZATION FOR EACH SESSION FOR THE USER IN
  THE SUBSYSTEM
  F = Σ SESSION_CPU_UTILIZATION
• PREADJUST CPU UTILIZATION FOR EACH SESSION
  PREADJUSTED_SESSION_CPU_UTILIZATION = F/E * SESSION_CPU_UTILIZATION

STEP 5: ADJUST SUBSYSTEM SESSION LEVEL UTILIZATION USING INSTANCE LEVEL
UTILIZATION
• COMPUTE SUM OF UTILIZATION FOR EACH SESSION
  D = Σ PREADJUSTED_SESSION_CPU_UTILIZATION
• COMPUTE SECOND COMPONENT OF THE SUBSYSTEM OVERHEAD
  OVHD2 = MIN(OVHD2_CAP, C-D)
  WHERE OVHD2_CAP CAN BE A FUNCTION OF ADJUSTED INSTANCE CPU UTILIZATION
• ADJUST CPU UTILIZATION OF SESSIONS
  ADJUSTED_SESSION_CPU_UTILIZATION =
  (C-OVHD2)/C * SESSION_CPU_UTILIZATION

STEP 6: COMPUTE OVERALL OVERHEAD WORKLOAD
TOTAL OVERHEAD UTILIZATION = OVHD1+OVHD2

STEP 7: COMPUTE SUBSYSTEM WORKLOAD UTILIZATION
• GROUP SESSIONS INTO WORKLOADS ON THE BASIS OF THEIR DB NAME OR USER
  NAME OR APPLICATION NAME
• COMPUTE CPU UTILIZATION FOR EACH WORKLOAD BY SUMMING ADJUSTED CPU
  UTILIZATION

[FOR SUBSYSTEM SUCH AS SYBASE]
`
`
`
`
METHOD AND SYSTEM FOR REDUCING
THE ERRORS IN THE MEASUREMENTS OF
RESOURCE USAGE IN COMPUTER
SYSTEM PROCESSES AND ANALYZING
PROCESS DATA WITH SUBSYSTEM DATA
`
The present invention relates to the measurement of the usage of
resources such as central processing units (CPU), memory, hard
disks, network bandwidth and the like by the processes and
subsystems in a computer system. Such measurements and analyses
are required for assuring satisfactory performance of the computer
systems and are complicated by the short-cuts taken by operating
systems and other entities in updating underlying variables, these
short-cuts having been implemented either to reduce the cost of
measurement or because measurement procedures were not given
enough attention during development.
`BACKGROUND
`
The CPU is one of the most important resources in computer
systems. For performance analysis, capacity planning, chargeback
and accounting functions, it is important to measure correctly the
overall CPU utilization as well as utilization by each consumer or
consumer group. While measurement tools have been perfected for
and are well integrated into "mature" operating systems such as
MVS, OpenVMS and OS/400, such has not been the case for UNIX and
other similar systems, because of their evolutionary and open
development. This is generally also the case when an operating
system is relatively "young".
Operational computer systems employ tightly woven interaction
between resources and consumers. As before stated, resources
include the central processing unit (CPU), memory, hard disks, and
network bandwidth. The term "consumers" is intended to embrace
processes, transactions, applications and the ultimate user. When
a consumer arrives at a resource or a server, it may have to wait
for its turn for service, then receive service, and then go to
another resource for additional service or depart from the system.
One of the goals of the measurement tools is to measure (a) the
overall utilization of various resources and (b) for each
consumer, the amount of time it uses each of the resources.
While it is generally fairly easy to measure resource consumption
on a system-wide basis, measurement of resource consumption on a
consumer-by-consumer basis is much more difficult and resource
intensive. Information on resource use by the consumer is needed
for (a) relating the resource use to actual need (by whom, for
what purpose or application) and (b) many performance tuning
actions, e.g., reducing the priority or rate of resource
consumption for less critical work.
Measuring computer performance has traditionally been more of an
art than a science. Analysis must at some level rely on
information received from the operating system. Brute force
capturing and time-stamping of each interesting event generally
uses too many resources and also distorts the measurements. In
addition, when there is too much data, difficult choices have to
be made on what to collect and analyze. More significantly, in
many cases, the meaning of the data can be uncertain and
ill-defined. And finally, there are often inherent biases in the
data, as it is collected via procedures that skew the data in one
direction or another.
The problem is even more complicated because what is seen by the
system as a consumer may, in fact, involve working on behalf of
several other consumers. For example, a database server process (a
consumer of system resources) actually performs computations on
behalf of several database users who connect to it by means of
establishing separate, concurrent sessions and sending
transactions. In such a case, one often wishes to find out the
resources used by each of the applications or transactions or
users, separately. This problem is made complex because the sum of
the measurements of the resources used by the database server on
behalf of its own consumers does not generally agree with the
database server resource use as measured by the system.
It is therefore incumbent on system administrators to use the data
with care and to have a systematic technique to resolve
ambiguities and contradictions. It is to the provision of such a
technique that the present invention is directed, so as to provide
a generic method that allows system administrators to combine data
from many different interrelated sources to compute a
statistically valid description of the system being studied. The
invention, furthermore, will be described in connection with its
illustrative application to UNIX systems, though, of course, being
useable with other systems as well.
While CPU measurement techniques have been discussed in the
literature for many years (Ferrari, D., Computer Systems
Performance Evaluation, Prentice Hall), there are inherently
severe limitations in the traditional methods. As discussed by the
present inventors in Agrawal, S. et al., "Measurement and Analysis
of Process and Workload CPU Utilization in Unix Environments" (to
be published in Proceedings of CMG96), there are two basic methods
to obtain system information: event-driven collection and
sampling. In event-driven collection, the operating system alerts
the collecting tool that a significant event (such as process
creation, process running, etc.) has occurred. The tool can then
query the system as to the nature of the event and update its
tables. The major limitation of this technique is that generally
there are so many significant events that one generates more
information than can be handled; and there is a danger that the
collection tool itself will dominate the resources of the system.
On the other hand, a tool that uses the sampling technique
periodically queries the operating system about its current state
and that of all processes. This technique has the promise of using
considerably fewer resources than an event-driven one. However, a
difficult trade-off must be made. If one samples too frequently,
there will again be the problem of too much data and too much
overhead used by the tool. If one samples too infrequently, on the
other hand, then there is the potential that much essential
information will be lost. We have shown, however, that one can
measure overall CPU utilization with sufficient accuracy using
sampling techniques even though individual consumers or groups of
consumers are measured less accurately, in accordance with the
present invention.
While this is a useful approach, working well if all the samples
taken by the system can be captured by the measurement tools, in
reality, the system samples the processing at a frequency much
greater than what can be recorded (typically every 1 to 10
milliseconds), and the measurement tool itself typically samples
the measurements taken by the system at a slower frequency (every
10 seconds or so). This introduces the problems that a consumer or
process may terminate between the successive samples of all of the
processes taken by the measurement tools and thus its resource use
between two such successive samples may be lost, and/or a consumer
or process may be created and terminated between the samples taken
by the measurement tool.
`
Such errors affect short-lived processes much more than
long-lived processes. One way to minimize this type of error is to
sample much more frequently, but doing so increases the overhead
of data collection.
`
The present invention addresses these shortcomings in the
measurement and analysis of computer system performance data for
such purposes as performance analysis, diagnosis, investigation,
capacity planning, modeling, and trending. In short, it assists in
all forms of computer and application performance management.
The invention achieves its improved results by enhanced data
collection that captures additional information during data
collection and essentially recreates data that is lost between
samples, and thus allows the collection of data by sampling
relatively infrequently while delivering a better quality of data.
This is done by an analysis technique for such data that provides
a truer picture of resource usage. Further, the integrated
analysis techniques of the invention allow one to combine data
from UNIX as well as its subsystems to get a truer picture of
resource usage than can be obtained by using data from one source
only.
`
`OBJECTS OF INVENTION
`
An object of the present invention, accordingly, is to provide a
new and improved method of and system for reducing errors in the
measurement of resource usage by computer system processes, and
analyzing process data with subsystem data, thereby obviating or
improving upon the above-described limitations of prior
techniques.
A further object is to provide a novel and systematic method that
can be used to analyze and reduce errors in measurements that
occur when one measures different aspects of the same combined
system from different tools or vantage points.
`Other and further objects will be explained hereinafter
`and are more particularly delineated in the appended claims.
`SUMMARY
`
In summary, from one of its important aspects, the invention
embraces a method of reducing errors in the measurement of the
usage of resources such as CPUs by computer system processes, for
such purposes as performance analysis and planning, that comprises
measuring the resource usage by the operating system processes of
the computer system by periodically sampling the CPU(s) to
determine whether idle or apparently busy, and if busy, with which
process; correcting the measurement of resource usage of
terminated processes; measuring the resource usage by one or more
process-implemented subsystems of the computer system by periodic
sampling; and correlating the measurement taken by the subsystems
with those taken by the operating system as corrected.
`
`Preferred and best mode techniques and measurement
`system design are hereinafter more fully presented.
`DRAWINGS
`
The invention will now be described with reference to the
accompanying drawings, in which
FIG. 1 is a general queuing representation of a computer system
consisting of resources and consumers;
FIG. 2 illustrates tick-based sampling by the kernel to measure
the overall CPU utilization and assign it to individual processes;
FIG. 3 explains the parent-child, and, in general,
ancestor-descendant, relationship of various consumers;
FIG. 4 is a listing of some of the attributes or metrics that can
be collected for a consumer;
FIG. 5 illustrates the basic data collection method outlining the
core steps of the invention, i.e., the enhanced data collection
method;
FIG. 6 shows the basic analysis method for analyzing UNIX
system-wide and process-specific data, in which processes are
grouped into appropriate workloads and resource usage for both
processes and workloads is adjusted for the capture ratio;
FIG. 7 illustrates further refinements to the basic analysis
method to improve the accuracy of the results;
FIG. 8 illustrates UNIX processes, processes on behalf of an
Oracle subsystem and the core elements using resources within an
Oracle instance;
FIG. 9 illustrates the preferred method for collecting data for
subsystems, such as the before-mentioned Oracle;
FIG. 10 illustrates a method for computing the resource usage by
the workloads and sessions defined within subsystems in accordance
with the invention. Using the Oracle subsystem as an example, this
figure illustrates how to analyze the subsystem data when there
are two levels of data available from the subsystem. The two
levels in this case are the database instance totals and the
totals for individual sessions; and
FIG. 11 illustrates a further modification to the method for
computing the resource usage by the workloads and sessions defined
within subsystems in accordance with the invention. Using the
Sybase subsystem as an example, this figure illustrates how to
analyze the subsystem data when there are three levels of data
available from the subsystem. The three levels in this case are
the database instance totals, individual RDBMS user totals and the
totals for individual sessions.
`
DESCRIPTION OF PREFERRED EMBODIMENT(S) OF INVENTION

As before stated, the invention will be described in the
illustrative context of UNIX system performance analysis and
workload characterization, the performance analysis of subsystems
such as database management systems and transaction processing
systems running under UNIX (e.g., Oracle, Sybase), and the general
procedures for dealing with error reductions when the measurements
are taken at different levels.
`
`UNIX PERFORMANCE ANALYSIS PROCESS
`DATA COLLECTION
`
In FIG. 1, a simplified generic view is presented of a computer
system with many consumers arriving and receiving service from CPU
resources.
On a machine running the UNIX operating system, there is a large
number of processes running that carry out the various demands
made on the machine by its users. UNIX "measures" overall CPU
utilization for the system and the processes by "sampling the CPU"
on every tick (typically every 10 milliseconds) to see if it is
busy, and if so, by which process. FIG. 2 illustrates this
tick-based sampling, representing successive CPU "idle" and "seen
busy" events. These events may occur during the execution of
different processes.
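The effect of tick-based charging can be illustrated with a small Python simulation. This is a sketch under stated assumptions rather than kernel code: CPU activity is modeled as hypothetical `(pid, start_ms, end_ms)` bursts, and the function names `charge_ticks` and `true_usage` are invented for the example.

```python
from collections import defaultdict


def charge_ticks(bursts, tick_ms, horizon_ms):
    """Charge one whole tick to whichever process is on the CPU at each tick."""
    charged = defaultdict(int)
    for t in range(tick_ms, horizon_ms + 1, tick_ms):
        for pid, start, end in bursts:
            if start <= t < end:      # process found running at the tick instant
                charged[pid] += tick_ms
                break                 # otherwise the CPU is seen idle
    return dict(charged)


def true_usage(bursts):
    """Actual CPU time per process, for comparison with the tick-based charge."""
    used = defaultdict(int)
    for pid, start, end in bursts:
        used[pid] += end - start
    return dict(used)


# Process 3 runs entirely between two ticks, so it is never seen.
bursts = [(1, 0, 12), (2, 12, 24), (3, 31, 38)]
print(charge_ticks(bursts, tick_ms=10, horizon_ms=40))
print(true_usage(bursts))
```

With a 10 ms tick, processes 1 and 2 are each charged one whole tick although they actually ran for 12 ms, and process 3, which ran for 7 ms between ticks, is not charged at all.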
There are many common tools available in UNIX to report on the CPU
utilization and other statistics. These utilities include commands
such as sar, accounting, ps, iostat and vmstat. Due to their
limitations and overhead, these tools do not provide a complete
and consistent picture of the system. In addition, unfortunately,
significant variations exist in the operation and availability of
these commands on different UNIX variants and in the actual
meaning of the measurements these utilities present. These
utilities were designed as stand-alone tools; each one was
designed to address the problem that the utility designer was
trying to solve at the time of its design. The procedure for
underlying measurement is not well documented and supported. The
outputs of these utilities have varied from release to release and
from UNIX variant to variant. As a result, it takes a large amount
of effort to collect, understand and interpret UNIX performance
data correctly and in consistent ways.

The present invention provides a collection tool to overcome these
limitations. Without modifying the UNIX kernel, this tool samples
the data collected by UNIX in its kernel data structures. Such
sampling, however, suffers from the error that if a process is
terminated between such samples, information on the resource use
since the last sample will be lost, an error that the invention
overcomes.

The processes within a UNIX system can be thought of as being
arranged as a tree; that is, every process except "init" has a
parent process. New processes are "forked" off of current
processes. We will call the forking process the "parent" and the
"forked off" process the "child". FIG. 3 illustrates this tree or
hierarchy of processes Pa,b,c,d, where a, b, c, d are hierarchical
numerals.

In UNIX, when a process dies (terminates), the operating system
normally places the resources it used in the "child" resource
usage data structure associated with the parent process. FIG. 4
shows some typical process attributes or metrics collected. The
technique uses the information in the child resource usage data
structure to recover the information about the terminated
processes. Prior data collection methods, developed over the last
15-20 years, ignore this child resource information. Use of this
information requires complex, novel methods due to several
problems and complexities, including: 1) each process has only one
structure to record the activity of its terminated children, so if
more than one child process has terminated it is not always
possible to precisely determine which resources which child had
used; 2) it can frequently happen that during the same interval
that a process dies, its parent dies also, and sometimes even the
grandparent will have died, and once a process is dead, the
structures containing information about it are no longer
available; 3) not every dying process will send its resource
information to its parent; and 4) because of the sequential nature
of sampling, it can happen that the information that a parent
reports about its dead children includes information from a
process that the tool has not yet recognized as dead. This can
happen when a process dies between the time that its information
is recorded and the time that its parent's information is
recorded.

In accordance with the invention, these problems are addressed as
follows.

Each sample period, the current utilization numbers are collected
from the operating system for each process. Whenever it is
observed that a process that was running during the previous
sample period is not running now, we look at the information that
was deposited with the parent structure. The structures in the
parent will generally have been incremented by the amount of the
resources used by the process that has just terminated, plus all
other processes that were terminated during the last sample
interval that were forked by this process, plus the resources used
by processes forked by the terminated processes.

This incremented number is compared to the amount that the
processes themselves reported that they used. If the parent's
records show that more was used than was known from the terminated
processes, the recorded resources used by the terminated processes
are increased by the extra amount. If there is more than one
terminated process with the same parent in the same sample
interval, the extra amount is evenly distributed among them.

Unfortunately, it is not possible to obtain the records of all the
processes at exactly the same time. And during the time that the
collection is taking place, other activity is continuing on the
machine being monitored. Thus, it is possible for the following
sequence of events to occur: a) information from Process A is
collected, b) Process A dies, c) information from Process B,
Process A's parent, is collected. When this information is
analyzed, Process A will not appear to have been terminated yet.
However, its data will have been added to the Process B data. This
anomaly can end up incorrectly distributing resources to other
children of Process B that might have terminated during the same
interval. This situation is handled by looking at the data
collected during the next sample interval. It is at this time that
it will be recognized that Process A has died. If it is also
noticed that the Process B information about its children has not
made a comparable jump, but it made such a jump in the previous
interval, the adjustments made in the previous interval are
undone.

This algorithmic approach has been found to lead to highly
reliable results.

ANALYSIS OF DATA: POST-PROCESSING TO
RECONCILE SYSTEM & PROCESS DATA AND
COMPUTE RESOURCE UTILIZATION BY
GROUPS OF PROCESSES OR WORKLOADS

The above method generally results in a process capture ratio
being equal to 1.0. In order to guard against the case when it is
not, in a post-processing step, the capture ratio is computed and
the process resource usage time corrected as shown in the flow
chart steps of FIG. 6. In the basic data collection method of
FIG. 5, the system and process counters, so-labeled, enable
sampling, averaging with previous samples, and recording for the
system counter flow, and sampling, previous sample comparison,
allocating the resource usage increase to deleted processes, and
recording for the process sampling. In this figure, X and Z are
the parameters that can be used to fine tune analysis and
correction.

In FIG. 6, A and B refer to the totals that are computed in the
respective steps. In order to make the data meaningful for
macro-analysis of resource usage, reporting, analysis and
modeling, the process data is further grouped according to process
names, full command name, user name and account name. This
converts detailed process data into resource usage statistics for
appropriate workloads or business entities. The capture ratio is
also used to correct the workload resource usage statistics.

The following is a simple way to correct inconsistent and
erroneous data. Compare system-wide data and per consumer data.
When there is a discrepancy, assume that the system-wide data is
more likely to be correct and adjust the per consumer data
accordingly. For instance, consider a case in which system-wide
data shows that a resource has been used at a 50% rate while the
per consumer information implies that it was used at a 40% rate.
Assuming that the system-wide data is correct and that the per
consumer data is all equally wrong, adjust all the per consumer
data upward by 25%, so that it again sums to the system-wide 50%.
Technically, we say we had a "capture ratio" of 80%; that is, we
found at the consumer level 80% of all the resources reported at
the system-wide level.
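The capture-ratio correction of FIG. 6 amounts to a few lines of arithmetic, sketched here in Python. The function name `apply_capture_ratio`, the process names and the workload mapping are hypothetical and chosen only to reproduce the 50% versus 40% example from the text.

```python
def apply_capture_ratio(system_total, process_usage, workload_of):
    """Scale per-process usage so that it sums to the system-wide total.

    system_total:  system-wide resource use (A)
    process_usage: {process: measured use}; its sum is (B)
    workload_of:   {process: workload name}
    """
    captured = sum(process_usage.values())
    capture_ratio = captured / system_total          # B / A
    # Divide each process's use by the capture ratio.
    adjusted = {p: u / capture_ratio for p, u in process_usage.items()}
    # Accumulate the adjusted use per workload.
    workloads = {}
    for p, u in adjusted.items():
        w = workload_of[p]
        workloads[w] = workloads.get(w, 0.0) + u
    return capture_ratio, adjusted, workloads


usage = {"oracle": 24.0, "sh": 10.0, "cc": 6.0}       # sums to 40 (% CPU)
groups = {"oracle": "database", "sh": "interactive", "cc": "interactive"}
ratio, adjusted, workloads = apply_capture_ratio(50.0, usage, groups)
print(ratio, adjusted, workloads)
```

Dividing by a capture ratio of 0.8 is the same as multiplying by 1.25, so every per-process figure rises by one quarter and the adjusted figures add up to the system-wide 50.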
For a next order of improvement in the accuracy of the
corrections, we assign the unaccounted-for CPU, A-B, to different
processes according to the lifespan of the process. Thus, in
FIG. 7, once the difference between the computed system-wide
resource total A and the process resource use B and the lifespan
estimates are determined, a part of the difference is allocated to
those workloads containing short-lived processes, and the
remainder is allocated to all workloads in proportion to their
respective resource usage.
RESOURCE USAGE ANALYSIS OF
SUBSYSTEMS, E.G., RDBMS: BACKGROUND
AND PROBLEM STATEMENT
`
Subsystems such as databases are implemented in UNIX
by a set of processes. These products perform work on
behalf of transactions or sessions that can be further
identified by attributes su