Eilert et al.

[54] APPARATUS AND METHOD FOR
     MANAGING A DISTRIBUTED DATA
     PROCESSING SYSTEM WORKLOAD
     ACCORDING TO A PLURALITY OF
     DISTINCT PROCESSING GOAL TYPES

[75] Inventors: Catherine Krueger Eilert; Peter
     Bergersen Yocom, both of Wappingers
     Falls, N.Y.

[73] Assignee: International Business Machines
     Corporation, Armonk, N.Y.

[21] Appl. No.: 383,168

[22] Filed: Feb. 3, 1995

[51] Int. Cl.6 .............................. G06F 13/00
[52] U.S. Cl. .............. 395/200.11; 395/200.03;
     395/180.01; 395/671; 395/672
[58] Field of Search .............. 395/650, 700,
     395/670, 671, 672, 675, 180.01, 200.03,
     200.11

[56] References Cited

U.S. PATENT DOCUMENTS

4,914,570   4/1990   Peacock .............. 364/200
5,031,089   7/1991   Liu et al. ........... 364/200
5,109,486   4/1992   Seymour .............. 395/200

US005675739A
[11] Patent Number: 5,675,739
[45] Date of Patent: Oct. 7, 1997

OTHER PUBLICATIONS

Silverman et al., “A Distributed System for Parallel Processing”, Software Practice and Experience, vol. 19(12), pp. 1163-1174, Dec. 1989.
Shivaratri et al., “Load Distributing for Locally Distributed Systems”, IEEE Computer, Dec. 1992, pp. 33-44.

Primary Examiner—Kevin A. Kriess
Assistant Examiner—Michael T. Richey
Attorney, Agent, or Firm—Lawrence D. Cutter; Heslin &
Rothenberg, P.C.

[57] ABSTRACT

An apparatus for managing a workload distributed across
data processing systems in accordance with a common
performance standard, which includes a means for measur-
ing the performance of the work units to create local
performance data; a means for sending said local perfor-
mance data to at least one other system; a means for
receiving performance data from at least one other system to
create remote performance data; and a means responsive to
said local and remote performance data for adjusting at least
one of the system control parameters to modify the perfor-
mance of the work units on the system to achieve the
common performance standard, is disclosed. Also disclosed
is a method for managing workload as represented by the
apparatus.

42 Claims, 17 Drawing Sheets
`
[Front-page drawing: reduced reproduction of FIG. 1, the system structure diagram (see Sheet 1 below).]
`
`
[Drawing sheets 1-17; the figure labels recoverable from the OCR follow.]
[Sheet 1, FIG. 1: system structure diagram of computer systems 100B/100C, operating system 101, dispatcher 102, ASCB queue 103, WLM 105, class table entry 106, SRM 112, and MGDPC 114, linked by the data transmission facility.]
[Sheet 2, FIG. 2: MGDPC overall logic flow, steps 201-219 (calculate multi-system and local performance indexes; multi-system and local passes; reallocate resources; send policy data; roll data histories).]
[Sheet 3, FIG. 3: select-receiver logic, steps 301-304 (walk importance levels; select the goal class with the worst multi-system or local PI).]
[Sheet 4, FIG. 4: state samples for find bottleneck (CPU delay, MPL delay, swap delay, auxiliary paging delay); FIG. 5: find-bottleneck logic, steps 501-508 (select the largest delay not yet selected and set the selected flag).]
[Sheet 5, FIG. 6: select-donor logic, steps 601-615 (select the goal class with the best multi-system or local PI at an importance that owns the required resource; otherwise no eligible donor).]
[Sheet 6, FIG. 7: fix logic, steps 701-707 (select new control-variable value, calculate performance index delta, find donors, assess net value, change control variable).]
[Sheet 7, FIG. 8: assess-net-value logic, steps 801-809 (receiver and donor goal attainment and relative importance decide whether a reallocation has net value).]
[Sheet 8, FIG. 9: calculate maximum demand and wait-to-using ratios.]
[Sheet 9, FIG. 10: project the effects of changing dispatching priorities (steps 903, 912); FIG. 11: achievable demand plot (achievable demand percentage vs. the ratio of maximum demand available to priority over maximum demand at priority).]
[Sheet 10, FIG. 12: calculate a new wait-to-using ratio, steps 1201-1215 (interpolate, inflate, deflate, override up or down, divide by 2).]
[Sheet 11, FIG. 13: calculate projected CPU-using samples, steps 1301-1303; FIG. 14: assess increasing MPL slots, steps 1401-1408 (ready user average plot, MPL delay plot, PI delta, donors, net value).]
[Sheet 12, FIG. 15: ready user average plot (ready users vs. MPL slots available to class); FIG. 16: MPL delay plot (MPL delay vs. percentage of ready users who have MPL slots); FIG. 18: swap delay plot (swap delay vs. average time swapped in processor storage).]
[Sheet 13, FIG. 17: assess increasing swap protect time, steps 1701-1708 (swap delay plot, donors, net value).]
[Sheet 14, FIG. 19: reduce auxiliary storage paging delay, steps 1903-1917 (raise protective processor storage targets for all address spaces, one address space, or common storage; donors; net value).]
[Sheet 15, FIG. 20: parts of a data history (roll counter 2001, per-row roll values 2002, data rows 2003); FIG. 21: a row of the remote response time data history (buckets from <0.500*goal through >5.00*goal).]
[Sheet 16, FIG. 23: per-goal-class calculation of the multi-system performance index from the remote response time or velocity histories.]
[Sheet 17, FIG. 22: remote data receiver, steps 2201-2205 (add received data to the remote response time or velocity history); FIG. 24: data sent per goal class (goal class name 2401 plus an array equivalent to a remote history row 2402).]
`5,675,739
`
`1
`APPARATUS AND METHOD FOR
MANAGING A DISTRIBUTED DATA
`PROCESSING SYSTEM WORKLOAD
`ACCORDING TO A PLURALITY OF
`DISTINCT PROCESSING GOAL TYPES
`
`INTRODUCTION
`
`The present invention relates to managing the perfor-
`mance of user application program execution across a set of
`interconnected, cooperating, independent computer systems.
The user application programs do the useful work that is
`the purpose of the computer systems. These application
`programs form the units of work whose execution is man-
`aged by the computer’s operating system software. These
`application programs are said to form a distributed workload
`in the sense that they are independent programs that are
`dispersed across the set of systems for execution, while
sharing a common set of performance goals.
These computer systems are said to be independent in the
sense that each is an entirely separate, wholly functional
computer system whose resources are controlled by its own
`copy of an operating system, such as IBM’s Multiple Virtual
`Storage/Enterprise Systems Architecture (MVS/ESA)™
`operating system. These computer systems are said to be
`cooperating in the sense that each is exchanging operational
`measurement data with the other computer systems in the
`set. These computer systems are said to be interconnected in
`the sense that each is able to transmit its operational mea-
`surement data to all the other systems in the set and to
`receive the analogous data from those other systems in the
`set. These computer systems comprise a set in the sense that
`they all have the characteristics cited above.
The present invention relates to managing performance in
the sense that the operating system is adapted to establish a
performance goal class for each distinct performance goal
established for the set of systems; to classify each applica-
tion program, in whichever system or systems of the set a
given application program is to be executed, into its respec-
tive goal class as a part of scheduling the application
program for execution; and to control the execution of the
application programs so that the application programs that
are assigned to a given goal class do meet the performance
goal across all the systems, taken as a set.
`BACKGROUND OF THE INVENTION
`
`Management of computer system performance has
`reached the point where the existing process controls are
`inadequate. The primary problem is one of complexity,
`given the large number of interrelated factors that affect
`actual achieved performance. Complexity is also signifi-
`cantly increased when the workload to be managed is
`distributed across multiple independent systems. In a single
`system, the invention disclosed in U.S. patent application
`Ser. No. 08/222,755, “APPARATUS AND METHOD FOR
`MANAGING A DATA PROCESSING SYSTEM WORK-
`LOAD ACCORDING TO TWO OR MORE DISTINCT
`PROCESSING GOAL TYPES” by J. D. Aman et al.
`(Aman), filed Apr. 4, 1994 and assigned to the assignee of
`the present invention, simplifies workload management by
`allowing management of work units according to end-user
oriented goals. This application is hereby incorporated by
reference.
`Current operating system software is not able to take over
`the responsibility for managing the performance of a work-
`load that is distributed across a plurality of interconnected,
`cooperating computer systems according to end-user ori-
`ented goals specified for that set of computer systems taken
`as a group.
`
`2
`SUMMARYOF THE INVENTION
`
`The present invention allows specification of a perfor-
`mance goal for each of a plurality of user performance goal
`classes for a workload that is distributed across a plurality of
`interconnected, cooperating but independent computer sys-
`tems. Exemplary goals may be specified in terms of response
`time or execution velocity. The relative importance of
achieving a goal may also be specified with each goal. Given
`a common awareness of the user performance goals, the
`operating system on each system being managed takes on
`responsibility for the allocation of its system resources to
`executing work units such that those goals are best achieved
`across the set of computer systems being managed.
`Tradeoffs are made to best utilize the available capacity,
`based on actual achievement toward goals and the types of
`computer system resources required to achieve those goals.
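By way of a concrete, hypothetical illustration (the patent does not prescribe a policy syntax; the class names and values below are invented), a single policy shared by every system in the managed set might be expressed as:

```python
# Hypothetical shared policy. Each user performance goal class carries
# one goal, of either the response time type (seconds) or the execution
# velocity type (percent), plus a relative importance (1 = most important).
policy = [
    {"class": "ONLINE", "goal_type": "response_time", "goal": 0.5,  "importance": 1},
    {"class": "BATCH",  "goal_type": "velocity",      "goal": 40.0, "importance": 3},
]
```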
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
`The detailed description explains the preferred embodi-
`ments of the present invention, together with advantages and
`features, by way of example with reference to the following
`drawings.
`FIG. 1 is a system structure diagram showing particularly
`a set of computer systems, each having a controlling oper-
`ating system and system resource manager component
`adapted as described for the present invention.
FIG. 2 is a flowchart showing the overall logic flow in the
`goal-driven performance-controller component.
`FIG. 3 is a flowchart showing logic flow for the select-
`receiver function.
`FIG. 4 illustrates the state data used to select resource
`bottlenecks.
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
FIG. 5 is a flowchart showing logic flow for the find-
`bottleneck function.
`
`FIG. 6 is a flowchart showing logic flow for the select-
`donor function.
`
FIG. 7 is a flowchart showing logic flow for the fix means.
FIG. 8 is a flowchart showing the general logic flow for
`the assess-net-value function for proposed changes.
`FIG. 9 is a flowchart of the steps to assess improving
`performance by increasing dispatch priority.
`FIG. 10 is a flowchart showing logic flow for projecting
`the effects of changing dispatching priorities.
`FIG. 11 is a sample graph of achievable demand.
FIG. 12 is a flowchart for calculating a new wait-to-using
ratio.
`
`FIG. 13 is a flowchart for calculating CPU time using
`sample deltas.
`FIG. 14 is a flowchart of the steps to assess improving
`performance by increasing MPL slots.
FIG. 15 is a sample graph of ready user average.
`FIG. 16 is a sample graph of MPL delay.
`FIG. 17 is a flow chart showing the steps to assess
`improving a receiver’s performance by increasing its swap
`protect time target.
`FIG. 18 is a sample graph of swap delay.
FIG. 19 is a flow chart showing the steps to take to reduce
a receiver's auxiliary storage paging delay.
FIG. 20 shows the parts of a data history.
FIG. 21 diagrams a row of the remote response time data
`history.
`FIG. 22 is a flow chart of the steps taken by the remote
`data receiver.
`
`45
`
`55
`
`65
`
`
`
`5,675,739
`
`3
`FIG. 23 is a flow chart of the steps taken to calculate the
multi-system performance index.
FIG. 24 shows the data sent to the remote systems for both
`response-time type performance goal classes and for
`execution-velocity type performance goal classes.
`
`DETAILED DESCRIPTION OF THE
`INVENTION
`
FIG. 1 illustrates the environment and the key features of
`the present invention for an exemplary embodiment having
`three interconnected, cooperating computer systems. The
`environment of this invention is that of a workload being
`distributed across a set of computer systems. This invention
`allows management of such a workload through a single
`policy for the set of systems. Having a single policy for the
`set of systems helps provide a single-image view of the
`distributed workload. Those skilled in the art will recognize
`that any number of such interconnected, cooperating com-
puter systems may be used without departing from the spirit
`or scope of this invention. The three computer systems
`(100-A, 100-B, 100-C) are executing a distributed workload,
`and each is controlled by its own copy of an operating
system (101) such as IBM’s MVS/ESA™.
`Each of these copies of the operating system executes the
`steps described in this specification. When the description
`below refers to the local system, it means the system that is
`executing the steps being described. The remote systems are
`all the other systems being managed. Note that each system
`considers itself local and all other systems remote.
A dispatcher (102) is a component of the operating system
`that selects the unit of work to be executed next by the
`computer. The units of work (150) are the application
programs that do the useful work that is the purpose of the
`computer system. The units of work that are ready to be
`executed are represented by a chain of control blocks in the
`operating system memory called the address space control
block (ASCB) queue (103). Each ASCB has a field that
`contains a relative importance (dispatching priority) value
`(104). The dispatching priority is set by operation of the
present invention and used by the dispatcher to select for
execution the highest-priority unit of work from those that are
`ready to be executed, as determined by the ASCB queue.
`The dispatching priority is a controlled variable provided by
`the present invention for meeting the stated performance
`goals for computer system operation.
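A minimal sketch of that selection rule follows (Python is used purely for illustration; the field names are assumptions, not the MVS/ESA control block layout):

```python
def dispatch(ascb_queue):
    """Pick the ready unit of work with the highest dispatching priority.

    The dispatching priority (104) is itself a controlled variable set by
    the performance controller; the dispatcher merely honors it.
    """
    ready = [ascb for ascb in ascb_queue if ascb.ready]
    return max(ready, key=lambda a: a.dispatching_priority, default=None)
```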
`The present invention takes as input the performance
`goals established by a system administrator and stored (141)
`on a data storage facility (140). The goals apply to work
`across all the systems being managed (100-A, 100-B, 100-
`C). The data storage facility (140) is accessible by each
`system being managed. The performance goals illustrated
`here are of two types: response time (in seconds) and
`execution velocity (in percent). Those skilled in the art will
`recognize that other goals, or additional goals, may be
`chosen without departing from the spirit or scope of this
invention. Included with the performance goals is the speci-
fication of the relative importance of each goal. The goals
`(141) are read into each system by a workload manager
(WLM) component (105) of the operating system on each of
`the systems being managed. Each of the goals, which were
`established and specified by the system administrator, causes
`the workload manager on each system to establish a perfor-
mance class to which individual work units will be assigned.
Each performance class is represented in the memory of the
`operating systems by a class table entry (106). The specified
goals (in an internal representation) and other information
relating to the performance class are recorded in the class
table entry. Other information stored in a class table entry
includes the multiprogramming level (MPL) target value
(107) (a controlled variable), the relative importance of the
goal class (108) (an input value), the multi-system perfor-
mance index (151) and the local performance index (152)
(computed values), the response time goal (110) (an input
value), the execution velocity goal (111) (an input value),
the remote response time history (157) (measured data), the
remote velocity history (158) (measured data), the sample
data history (125) (measured data), and the response time
history (126) (measured data).
`The goal driven performance controller of the system
`resource manager (SRM) component (112) of the operating
`system (Aman) is modified according to the present inven-
`tion to be a multi-system goal-driven performance-
`controller (MGDPC) component (114) and to operate in the
`following manner as shown in FIG. 1. The MGDPC per-
forms the functions of measuring the achievement of goals,
`selecting the user performance goal classes that need their
`performance improved, and improving the performance of
`the user performance goal classes selected by modifying the
controlled variables of the associated work units, as
`described later. The MGDPC function is performed periodi-
`cally based on a periodic timer expiration approximately
`every ten seconds in the preferred embodiment.
`At (115), a multi-system performance index and a local
`performance index are calculated for each user performance
`goal class (106) using the specified goal (110 or 111). The
`multi-system performance index represents the performance
of work units associated with the goal class, across all the
`systems being managed. The local performance index rep-
`resents the performance of work units associated with the
`goal class on the local system. The resulting performance
`indexes are recorded in the corresponding class table entry
`(106) at (151) and (152). The concept of a performance
`index as a method of measuring user performance goal
`achievement is well known. For example, in U.S. patent
`application Ser. No. 07/876,670, “Workload Manager for
`Achieving Transaction Class Response Time Goals in a
Multiprocessing System”, by D. Ferguson, et al., filed Apr.
`30, 1992 and assigned to the assignee of the present
`invention, the performance index is described as the actual
response time divided by the goal response time. This
application is hereby incorporated by reference.
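In code, the response-time form of the index and a multi-system variant might look like this (the aggregation across systems is a simplifying assumption; the patent computes it from the accumulated remote histories):

```python
def local_performance_index(actual_response_time, goal_response_time):
    # PI = actual / goal; a PI above 1.0 means the goal is being missed.
    return actual_response_time / goal_response_time

def multi_system_performance_index(local_actual, remote_actuals, goal):
    # Illustrative aggregation: average the local observation with those
    # reported by the remote systems, then divide by the goal.
    observations = [local_actual, *remote_actuals]
    return (sum(observations) / len(observations)) / goal
```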
`At (116), a user performance goal class is selected to
`receive a performance improvement in the order of the
relative goal importance (108) and the current value of the
`performance indexes (151,152). The selected user perfor-
`mance goal class is referred to as the receiver. The MGDPC
`first uses the multi-system performance index when choos-
`ing a receiver so that the action it takes has the largest
`possible impact on causing work units to meet goals across
`all the systems being managed. When there is no action to
`take based on the multi-system performance index, the local
performance index is used to select a receiver that will most
help the local system meet its goals.
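A condensed sketch of that ordering (simplified from FIG. 3, using the ClassTableEntry fields assumed above):

```python
def select_receiver(classes, pass_type):
    """Walk importance levels from most to least important; at each level
    pick the goal class missing its goal with the worst performance
    index: the multi-system PI on the multi-system pass, the local PI
    on the local pass."""
    pi = (lambda c: c.multi_system_pi) if pass_type == "multi" else (lambda c: c.local_pi)
    for imp in sorted({c.importance for c in classes}):   # 1 = most important
        missing = [c for c in classes if c.importance == imp and pi(c) > 1.0]
        if missing:
            return max(missing, key=pi)   # worst PI at this importance
    return None
```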
`The performance bottleneck, relative to the controlled
`variables, is determined (117) by using state samples (125),
`a well known technique. The controlled variables are pro-
`tective processor storage target (133) (affects paging delay),
`swap protect time target (132) (affects swap delay), multi-
programming level (MPL) target (107) (affects MPL delay),
`and dispatch priority (104) (affects CPU delay).
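A sketch of the selection step, assuming the state samples arrive as per-category delay counts (FIG. 5 additionally distinguishes the type of paging delay, omitted here):

```python
def find_bottleneck(delay_samples, already_selected):
    """Return the largest delay category not yet tried for this receiver.

    Categories map onto controlled variables: 'cpu' -> dispatch priority,
    'mpl' -> MPL target, 'swap' -> swap protect time target, and
    'paging' -> protective processor storage target.
    """
    remaining = {k: v for k, v in delay_samples.items()
                 if k not in already_selected}
    if not remaining:
        return None
    bottleneck = max(remaining, key=remaining.get)
    already_selected.add(bottleneck)   # the 'selected flag' of FIG. 5
    return bottleneck
```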
At (118), the potential changes to the controlled variables
are considered. A user performance goal class is selected
`(123) for which a performance decrease can be made based
`on the relative goal importance (108) and the current value
`of the performance indexes (151, 152). The user perfor-
`mance goal class thus selected is referred to as the donor.
`Next, the proposed changes are assessed (124) for net value
`relative to the expected changes to the multi-system and
`local performance indexes for both the receiver and the
`donor for each of the controlled variables (dispatch priority
`(120) (104), the swap protect time target (130) (132), the
`protective processor storage target (131) (133), and the MPL
`target (121) (107)). A proposed change has net value if the
result would yield more improvement for the receiver than
harm to the donor relative to the goals. If the proposed
change has net value, then the respective controlled variable
`is adjusted for both the donor and the receiver.
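The net-value test can be sketched as follows (the importance weighting is an illustrative assumption; FIG. 8 gives the actual decision tree, which also considers whether each side is meeting its goals):

```python
def has_net_value(receiver_pi_improvement, donor_pi_harm,
                  receiver_importance, donor_importance):
    # Weight each projected performance-index change by the importance
    # of the class it affects (importance 1 = most important, so divide).
    weighted_gain = receiver_pi_improvement / receiver_importance
    weighted_harm = donor_pi_harm / donor_importance
    return weighted_gain > weighted_harm
```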
`Each system to be managed is connected to a data
`transmission mechanism (155) that allows each system to
`send data records to every other system. At (153) a data
`record describing the recent performance of each goal class
`is sent to every other system.
`The multi-system goal driven performance controller
(MGDPC) function is performed periodically (once every
ten seconds in the preferred embodiment) and is invoked via
a timer expiration. The functioning of the MGDPC provides
a feedback loop for the incremental detection and correction
`of performance problems so as to make the operating system
`adaptive and self-tuning.
At (154) a remote data receiver receives performance data
from remote systems asynchronously from the MGDPC.
The received data is placed in the remote performance data
histories (157, 158) for later processing by the MGDPC.
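Together, steps (153) and (154) might be sketched like this (the record layout follows FIG. 24: a goal class name plus an array equivalent to one remote history row; `transmit` stands in for the data transmission mechanism (155), and the helper names are assumptions):

```python
def send_performance_data(classes, transmit):
    # Step (153): once per MGDPC interval, send one record per goal
    # class to every other system in the managed set.
    for c in classes:
        transmit({"class": c.name, "history_row": c.latest_history_row()})

def remote_data_receiver(record, classes_by_name):
    # Step (154): runs asynchronously from the MGDPC; file the received
    # row into the matching class's remote history (157 or 158) for
    # later processing.
    classes_by_name[record["class"]].remote_history.append(record["history_row"])
```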
`
`PRIMARY PROCESSING STEPS
`
`FIG. 2 is a flowchart showing the logic flow for the
`primary processing steps of the multi-system goal driven
performance controller (MGDPC) of the present invention.
`The primary objective of the MGDPC is to meet perfor-
`mance goals across all the systems being managed. This
`objective is met without any centralized control. Instead,
`each system receives performance data from all the other
`systems being managed and, based on its view of how the
`entire distributed workload is doing, makes resource allo-
`cation decisions to best meet goals. A secondary objective of
`the MGDPC is to meet performance goals locally, in which
`case resource allocation decisions are made using local and
`remote data.
`
`At (201), a multi-system and a local performance index
are calculated for each user performance goal class and the
`current values are calculated for the ready-user-average
graph (FIG. 15), the MPL-delay graph (FIG. 16), and the
swap-delay graph (FIG. 18). (The performance index cal-
`culation is described below.) Up to two passes may be made
through steps (203)-(213). The first pass is called the
`multi-system pass. During the multi-system pass, receivers
and donors are chosen based on how a goal class is per-
`forming across all the systems being managed. During the
`second pass, called the local pass, receivers and donors are
chosen based on how a goal class is performing locally
`in addition to how it is performing across all the systems
`being managed. The local pass is performed only when no
`action is taken during the multi-system pass. At (202), a loop
control variable is set to indicate that the multi-system pass
`is currently being processed. At (203), a user performance
goal class, referred to as the receiver, is selected to have its
`performance improved. The selection process is shown in
more detail in FIG. 3, described later. At (204), one of the
`receiver's resource bottlenecks is selected as the bottleneck
`to be addressed. Bottleneck selection is shown in more detail
`in FIG. 5, described later. At (205), user performance goal
`classes that own the resource identified as required to
improve the performance of the receiver are selected. These
`selected user performance goal classes are referred to as
`donors. Donors are selected in reverse order to receivers,
`that is, the user goal classes having the best performance
indexes and least importance are selected. Donor selection is
`shown in more detail in FIG. 6, described later.
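Donor selection can be sketched as the mirror image of receiver selection (simplified from FIG. 6; the `owned_resources` attribute is an assumption standing in for resource ownership):

```python
def select_donor(classes, resource, pass_type):
    """Among classes owning the required resource, prefer a class that is
    meeting its goals; otherwise take the least important class with the
    best (lowest) performance index for the current pass."""
    pi = (lambda c: c.multi_system_pi) if pass_type == "multi" else (lambda c: c.local_pi)
    owners = [c for c in classes if resource in c.owned_resources]
    if not owners:
        return None                  # no eligible donor
    meeting = [c for c in owners if pi(c) <= 1.0]
    if meeting:
        return min(meeting, key=pi)  # a donor already meeting its goals
    # Otherwise: least important first (highest numeric importance), best PI.
    return min(owners, key=lambda c: (-c.importance, pi(c)))
```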
`At (206), the effects of the resource reallocation from the
`donor to the receiver are projected. The algorithms used to
`project the effects of resource reallocation depend on the
`resource involved. Each of the algorithms is described
below in this specification. At (207), the net value of
`reallocating the resource from the donor or donors to the
`receiver is assessed. A receiver will only be improved by
`reallocating resource from a specific donor if there is pro-
`jected to be net positive value to the resource reallocation.
`If using a donor to improve a receiver is projected to result
in more harm to the donor than improvement to the receiver
relative to the goals and importance, the resource realloca-
tion is not done. Net value assessment is shown in more
detail in FIG. 8, described later.
If there is net positive value to the reallocation, the
`resources are reallocated from the donor or donors to the
`receiver at (208). If there is not net positive value, a check
`is made at (209) to determine whether there is another
`potential donor.
`If there is another potential donor, control passes to (205)
`to select another potential donor. If there are no more
`potential donors of the resource required to address the
`selected bottleneck, a check is made at (210) to determine
`whether the receiver has another bottleneck.
`If the receiver has another bottleneck that can be
`addressed, control returns to (204) to select another bottle-
`neck. If the receiver has no more bottlenecks to address, a
`check is made at (211) to determine whether there is another
`potential receiver.
`If there is another potential receiver, control returns to
`(203) to select another potential receiver. If there are no
`more potential receivers, a check is made at (212) to see if
`the loop control variable indicates that the current pass is the
`multi-system pass. If the current pass is the multi-system
`pass, at (213) the loop control variable is set to indicate that
the current pass is the local pass and control returns to (203)
`to select another potential receiver.
`If the local pass has already been taken, control passes to
`(214) to send performance data to each of the remote
`systems that is being managed. At (215) data histories are
`rolled (see description of data histories below).
`The MGDPC function is invoked again when the timer
`next expires, providing feedback on the effect of the
resource reallocations made previously and again providing
`the opportunity to address performance problems.
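The flow of FIG. 2 condenses to roughly the following skeleton (every helper name here is an illustrative placeholder for the steps described above, passed in so the sketch stays self-contained):

```python
def mgdpc_interval(classes, s):
    """One timer-driven MGDPC interval. `s` bundles the per-step helper
    functions; parenthesized numbers refer to the steps of FIG. 2."""
    s.calculate_performance_indexes(classes)                 # (201)

    def one_pass(pass_type):                                 # (202) / (213)
        for receiver in s.iter_receivers(classes, pass_type):          # (203), (211)
            for bottleneck in s.iter_bottlenecks(receiver):            # (204), (210)
                for donors in s.iter_donor_sets(classes, bottleneck):  # (205), (209)
                    delta = s.project(receiver, donors, bottleneck)    # (206)
                    if s.has_net_value(delta):                         # (207)
                        s.reallocate(receiver, donors, bottleneck)     # (208)
                        return True
        return False

    if not one_pass("multi"):     # local pass only if no multi-system action
        one_pass("local")         # (212) -> (213)
    s.send_performance_data(classes)                         # (214)
    s.roll_data_histories(classes)                           # (215)
```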
`DATA HISTORIES
`
`A data history is a mechanism to collect and analyze data
over time. By using data histories the MGDPC can use data
that has enough samples to be representative without using
data that is so old that it might be out of date. FIG. 20 shows
`an exemplary data history. A data history contains 6 rows of
`data (2003) and a roll counter (2001). Each row represents
`data from a range of time in history. Row 1 contains data
`from the most recent period only. Rows 2-6 contain varying
ranges of older data. The roll counter controls when to roll
`a row of data from one time range to a time range further
`back in time. The roll counter is incremented each MGDPC
`interval. Each row has associated with it a number (2002)
`that corresponds to the roll counter value specifying when
the data in the row should be ‘rolled’. Row 1’s roll counter
value is 1, which means row 1 is rolled every interval. Row
2’s roll counter value is 2; row 2 is rolled every second
interval. Row 3’s roll counter value is 4, row 4’s roll counter
value is 16, row 5’s roll counter value is 64, and row 6’s roll
`counter value is 64.
`
`Data is added to the history as follows. New data is added
`to row 1. At the end of each MGDPC interval the highest
`row whose roll counter value evenly divides into the current
roll counter value is found. The contents of that row are added
to the next numerically higher row. The contents of all the
numerically lower rows are moved up one row, leaving row
1 empty. When it is time to roll data out of row six, the data
`is discarded. To get d
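A sketch of one roll operation under the values given above (rows held as Python lists, row 1 at index 0; merging row contents by list concatenation is an illustrative assumption):

```python
ROLL_VALUES = [1, 2, 4, 16, 64, 64]   # per-row roll counter values (2002)

def roll_history(rows, roll_counter):
    """Apply one end-of-interval roll to a six-row data history.

    Find the highest row whose roll value evenly divides the current
    roll counter, merge its contents into the next higher row (data
    rolled out of row 6 is discarded), shift every lower row up one,
    and leave row 1 empty for the next interval's data.
    """
    highest = max(i for i, rv in enumerate(ROLL_VALUES) if roll_counter % rv == 0)
    if highest + 1 < len(rows):
        rows[highest + 1] = rows[highest + 1] + rows[highest]
    for i in range(highest, 0, -1):   # move lower rows up one position
        rows[i] = rows[i - 1]
    rows[0] = []                      # row 1 starts empty

# For example, with roll_counter = 2, row 2 merges into row 3 and
# row 1's data moves to row 2.
```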