`
(12) United States Patent
Romero et al.

(10) Patent No.: US 7,748,005 B2
(45) Date of Patent: *Jun. 29, 2010

(54) SYSTEM AND METHOD FOR ALLOCATING A PLURALITY OF RESOURCES BETWEEN A PLURALITY OF COMPUTING DOMAINS

(75) Inventors: Francisco Romero, Plano, TX (US); Cliff McCarthy, Richardson, TX (US); Scott Rhine, Frisco, TX (US)

(73) Assignee: Hewlett-Packard Development Company, L.P., Houston, TX (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 1625 days.

This patent is subject to a terminal disclaimer.
`
(21) Appl. No.: 10/938,961

(22) Filed: Sep. 10, 2004

(65) Prior Publication Data

US 2005/0039183 A1    Feb. 17, 2005

Related U.S. Application Data

(63) Continuation-in-part of application No. 10/206,594, filed on Jul. 26, 2002, now Pat. No. 7,140,020, which is a continuation-in-part of application No. 09/493,753, filed on Jan. 28, 2000, now Pat. No. 7,228,546.

(51) Int. Cl.
    G06F 9/46 (2006.01)
    G06F 15/173 (2006.01)

(52) U.S. Cl. .......... 718/104; 718/102; 718/103; 718/100; 709/226

(58) Field of Classification Search; see application file for complete search history.

(56) References Cited

U.S. PATENT DOCUMENTS

5,506,975 A *   4/1996  Onodera ................... 718/1
5,537,542 A *   7/1996  Eilert et al. ............. 709/201
5,594,726 A *   1/1997  Thompson et al. ........... 370/485
5,675,739 A *  10/1997  Eilert et al. ............. 709/226
5,961,596 A *  10/1999  Takubo et al. ............. 709/224
6,011,804 A *   1/2000  Bertin et al. ............. 370/468
6,081,826 A *   6/2000  Masuoka et al. ............ 718/100
6,330,586 B1 * 12/2001  Yates et al. .............. 709/201
6,393,455 B1 *  5/2002  Eilert et al. ............. 718/105
6,681,232 B1 *  1/2004  Sistanizadeh et al. ....... 707/104.1
6,694,419 B1 *  2/2004  Schnee et al. ............. 711/173
6,738,886 B1 *  5/2004  Mendoza et al. ............ 711/173
6,775,825 B1 *  8/2004  Grumann et al. ............ 717/127
6,859,926 B1 *  2/2005  Brenner et al. ............ 718/100
6,922,774 B2 *  7/2005  Meushaw et al. ............ 713/151
6,961,941 B1 * 11/2005  Nelson et al. ............. 719/319
6,993,762 B1 *  1/2006  Pierre .................... 718/102
7,096,248 B2 *  8/2006  Masters et al. ............ 709/201
7,171,654 B2 *  1/2007  Werme et al. .............. 717/130
7,181,743 B2 *  2/2007  Werme et al. .............. 718/104
7,191,440 B2 *  3/2007  Cota-Robles et al. ........ 718/1
7,203,941 B2 *  4/2007  Demsey et al. ............. 718/1
7,272,831 B2 *  9/2007  Cota-Robles et al. ........ 718/1
7,281,249 B2 * 10/2007  Tarui et al. .............. 718/102
7,356,817 B1 *  4/2008  Cota-Robles et al. ........ 718/1
7,433,951 B1 * 10/2008  Waldspurger ............... 709/226
2002/0049841 A1 *  4/2002  Johnson et al. ......... 709/225
2002/0169987 A1 * 11/2002  Meushaw et al. ........ 713/201

FOREIGN PATENT DOCUMENTS

JP    2002-041304    2/2002

* cited by examiner

Primary Examiner—Meng-Ai An
Assistant Examiner—Caroline Arcos

(57) ABSTRACT

In an embodiment, a computing system comprises a plurality of resources, a first manager process for allocating the plurality of resources on a dynamic basis according to service level parameters, and a plurality of computing domains, wherein at least one application, a respective second manager process, and a respective performance monitor process are executed within each computing domain, and wherein the performance monitor generates performance data related to the execution of the at least one application and the second manager process requests additional resources from the first manager process in response to analysis of performance data in view of at least one service level parameter.

20 Claims, 5 Drawing Sheets
`
`
`
`
`
`
`
`
`
`
`
[Sheet 1 of 5, FIG. 1: partition load manager (PLM) 101 and partition resource allocator operating with a plurality of partitions 103; each partition runs applications on its own OS.]
`
`
[Sheet 2 of 5, FIG. 3: flow chart of the PLM arbitration loop: receive requests from the WLMs; determine whether reallocation is necessary; examine requests from the partitions in the highest unexamined priority group; if each partition can be allocated its requested amount, assign it, otherwise designate a current target as the lowest of (1) a previously allocated amount or (2) a requested amount of one partition of the group (neither used as a previous target), allocate an amount to equal the current target or divide equally among partitions of the group, ignoring partitions that have reached their requested amounts; repeat while more priority groups or unallocated amounts remain.]
`
`
[Sheet 3 of 5, FIGS. 4A and 4B: example allocations by priority (1-6) across partitions, with cumulative totals (box 316); FIG. 5A: rounder data flow (arbitrated values, cumulative summing, subtraction to rounded values).]
`
`
`
[Sheet 4 of 5, FIGS. 5B and 5C: worked examples of the rounder.

FIG. 5B (R1=3.5, R2=3.5, R3=3.0):
S1 = R1 + 0 = 3.5 → 4
S2 = R1 + R2 = 7.0 → 7
S3 = R1 + R2 + R3 = 10.0 → 10
R1′ = S1 − 0 = 4;  R2′ = S2 − S1 = 3;  R3′ = S3 − S2 = 3

FIG. 5C (R1=10.1, R2=20.2, R3=30.3, R4=39.4):
S1 = R1 + 0 = 10.1 → 10
S2 = R1 + R2 = 30.3 → 30
S3 = R1 + R2 + R3 = 60.6 → 61
S4 = R1 + R2 + R3 + R4 = 100.0 → 100
R1′ = S1 − 0 = 10;  R2′ = S2 − S1 = 20;  R3′ = S3 − S2 = 31;  R4′ = S4 − S3 = 39

FIG. 6 (partial): computer system block diagram including a display adapter and an interface adapter.]
`
`
[Sheet 5 of 5, FIG. 7: another representative system, with a host OS (virtualization layer) supporting a plurality of virtual machines.]
`
`SYSTEM AND METHOD FOR ALLOCATING
`A PLURALITY OF RESOURCES BETWEEN A
`PLURALITY OF COMPUTING DOMAINS
`
`RELATED APPLICATIONS
`
`
`
The present invention is a continuation-in-part of U.S. Pat. No. 7,140,020, entitled “DYNAMIC MANAGEMENT OF VIRTUAL PARTITION COMPUTER WORKLOADS THROUGH SERVICE LEVEL OPTIMIZATION,” issued Nov. 21, 2006, which is a continuation-in-part of U.S. Pat. No. 7,228,546, entitled “DYNAMIC MANAGEMENT OF COMPUTER WORKLOADS THROUGH SERVICE LEVEL OPTIMIZATION,” issued Jun. 5, 2007, both of which are incorporated herein by reference.
`
`10
`
`15
`
`FIELD OF THE INVENTION
`
`The present application is generally related to allocating a
`plurality of resources between a plurality of computing
`domains.
`
`20
`
`DESCRIPTION OF RELATED ART
`
Computer systems inherently have limited resources, particularly CPU resources. These limited resources must be allocated among the different applications operating within the system. A known allocation mechanism for allocating system resources to applications is a system known as a Process Resource Manager (PRM). It is used to partition the CPU resource and various other resources among the different applications. The PRM partitions the resources into fractions of the whole. The fractions or pieces are then assigned to groups of processes, which comprise applications. Each application would then receive some portion of the available resources.
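As a rough illustration (hypothetical names and numbers, not taken from the patent), such an assignment amounts to a fixed table of fractions that only an administrator changes:

    # Hypothetical static PRM-style configuration: fixed fractions of the
    # CPU resource assigned to groups of processes (applications).
    static_shares = {"database": 0.50, "web_server": 0.30, "batch": 0.20}
    assert abs(sum(static_shares.values()) - 1.0) < 1e-9  # fractions of the whole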
`
Virtual machine technology (such as the ESX server product available from VMware) is another example of partitioning functionality. Virtualization software typically executes in connection with a host operating system of the physical server. The virtualization software creates virtual resources as software constructs. The virtual resources are then assigned to virtual machines. Specifically, the virtual resources are used to execute “guest” operating systems that execute on top of the host operating system. The guest operating systems are then used to execute applications. The assignment of the virtual resources to the virtual machines thereby allocates resources between the respective applications.
The PRM and similar assignment mechanisms are static mechanisms, meaning that the allocation configuration is fixed by an administrator and can only be changed by an administrator. In other words, the administrator specifies where the partitions should lie. To configure the partitions, an administrator has to think in terms of the actual machine resources and the requirements of the different applications. Specifically, the administrator analyzes the lower level operations of the resources and applications to create the “shares” or fractions of system resources to be assigned to each application. Typically, an administrator will vary the configuration shares over time to determine an acceptable set of shares for the respective applications.
In an alternative mechanism, a priority based algorithm is employed to service applications according to a service queue. Specifically, each application is executed in a common computing environment. To control the execution of processes within the common computing environment, applications are placed in a queue to receive processing resources. Applications of high priority are serviced from the queue before lower priority applications. Also, in the priority based algorithm, the priorities of the applications can be varied to adjust processing performance.
`
`SUMMARY
`
In an embodiment, a computing system comprises a plurality of resources, a first manager process for allocating the plurality of resources on a dynamic basis according to service level parameters, and a plurality of computing domains, wherein at least one application, a respective second manager process, and a respective performance monitor process are executed within each computing domain, and wherein the performance monitor generates performance data related to the execution of the at least one application and the second manager process requests additional resources from the first manager process in response to analysis of performance data in view of at least one service level parameter.

In another embodiment, a method comprises creating a plurality of computing domains, allocating a plurality of resources between the plurality of computing domains, executing at least one application, a manager process, and a performance monitor process in each of the plurality of computing domains, wherein the performance monitor process generates performance data related to the at least one application and the manager process requests additional resources in response to analysis of the performance data in view of at least one service level parameter, and dynamically reallocating the plurality of resources between the plurality of computing domains in response to received requests for additional resources according to service level parameters.

In another embodiment, a computer readable medium comprises code for generating performance data related to respective applications associated with a plurality of computing domains, code for requesting additional resources for ones of the plurality of computing domains in response to analysis of performance data from the code for generating in view of at least one service level parameter, and code for dynamically allocating resources between the plurality of computing domains in response to the code for requesting, wherein the code for dynamically allocating determines when to reallocate resources using service level parameters associated with applications of the plurality of computing domains.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
FIG. 1 depicts a partition load manager (PLM) operating with a plurality of partitions according to one representative embodiment.

FIG. 2 depicts a partition of FIG. 1 according to one representative embodiment.

FIG. 3 depicts a flow chart of the operations of the PLM of FIG. 1 according to one representative embodiment.

FIGS. 4A and 4B depict examples of allocation of resources by the PLM of FIG. 1 according to one representative embodiment.

FIGS. 5A, 5B, and 5C depict the operation of the rounder of the PLM of FIG. 1 according to one representative embodiment.

FIG. 6 depicts a block diagram of a computer system which is adapted to use one representative embodiment.
`
`
`
`FIG. 7 depicts another system adapted according to one
`representative embodiment.
`
`DETAILED DESCRIPTION
`
Some representative embodiments dynamically respond to changes in workload characteristics in a computer system. The computer system may comprise a single small computer, e.g. a personal computer, a single large computer (e.g. an enterprise server), or a network of large and/or small computers. The computers, particularly the large computers, or the network may be divided into protection domains or partitions. Each partition may be running its own operating system. An allocation mechanism of one embodiment preferably allows the administrator to think in terms of performance goals rather than computer system resources and requirements. Consequently, the administrator preferably defines a variety of performance goals with different priorities between them, and the allocation mechanism will preferably make any necessary adjustment of the resources. The goals can preferably be set without regard to partitions. For example, a goal for a database portion of the computer system could be that a retrieval transaction should not take more than 10 milliseconds. The allocation mechanism would then manipulate the resources to achieve this goal. For multiple partition computer systems, the resources may be manipulated within a partition, e.g. processor time being allocated among applications, or the resources may be manipulated between partitions, e.g. reassigning a processor from one partition to another (effectively resizing the partitions), or a combination of both. In another embodiment, resources may be allocated between virtual machines by changing the entitlements associated with the various virtual machines as discussed with regard to FIG. 7. A scheduling agent may then schedule processor resources to threads associated with the virtual machines according to the entitlements of the virtual machines.
`
The allocation mechanism preferably includes a partition load manager (PLM) that receives resource request information from the partitions of the system. The PLM preferably examines the resource request information and compares the request information with the available resources. Based on the comparison, the PLM may increase, decrease, or leave unchanged a particular partition’s resources. If the performance of a partition is lagging, e.g., if transactions are taking longer than the goals, then the partition may request an increase in the resource entitlement from the PLM. If a partition is over-achieving, then the partition may inform the PLM that it has excess resources, and the PLM may decrease its entitlement and allocate it to another partition or partitions.
Each partition preferably includes a work load manager (WLM) which operates similarly to the PLM, but operates within a particular partition. An example WLM is more fully explained in U.S. application Ser. No. 09/493,753 entitled “DYNAMIC MANAGEMENT OF COMPUTER WORKLOADS THROUGH SERVICE LEVEL OPTIMIZATION,” filed Jan. 28, 2000, which is hereby incorporated herein by reference. Each WLM also receives goal information and priority information from a user or administrator. Note that such goal and priority information may be the same for all partitions or the information may be specific to each partition or groups of partitions. The WLM also receives performance information from performance monitors, which are processes that monitor the performance of the applications and devices within the partition. The WLM examines the information from the performance monitors and compares the information with the goals. Based on the comparison, the WLM may
`
`10
`
`15
`
`20
`
`25
`
`30
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`4
increase, decrease, or leave unchanged an application’s entitlement. If the performance of an application is lagging, e.g., if transactions are taking longer than the goal, then the WLM increases the entitlement. If an application is over-achieving, then the WLM will decrease its entitlement and allocate it to another application.
The WLMs also interact with the PLM. Each WLM initially and periodically, after determining its resource needs, sends resource request information to the PLM. The PLM, after receiving such requests, then allocates system resources between the partitions. Each WLM, after receiving information about its partition resources, then allocates its allotted resources among the applications on its partition.

In multiple partition systems, the PLM may reside in one partition and have access to the other partitions. Alternatively, the PLM may reside in a service module that manages all of the partitions. Alternatively, the PLM may reside in each partition, and the PLMs may cooperatively allocate resources amongst themselves.
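This WLM/PLM interaction can be pictured with a minimal sketch (Python; all names and structures are hypothetical, since the patent describes the protocol only in prose). The WLM periodically samples its monitors, totals the resources its prioritized goals need, and sends that request to the PLM:

    import time

    def wlm_cycle(goals, monitors, plm, interval_s=60):
        # Hypothetical WLM loop: sample performance monitors, total the
        # resources needed per priority, and send the request to the PLM.
        while True:
            perf = {name: mon.sample() for name, mon in monitors.items()}
            request = {}   # priority level -> {resource: amount}
            for goal in goals:
                needed = goal.resources_needed(perf)  # compare data to goal
                level = request.setdefault(goal.priority, {})
                for resource, amount in needed.items():
                    level[resource] = level.get(resource, 0) + amount
            plm.send_request(request)   # PLM arbitrates between partitions
            time.sleep(interval_s)      # administrator-specified interval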
`
In one embodiment, the PLM allocates the resources between the different partitions based on the priorities of the partitions and the resource requests. This movement of resources is referred to as re-sizing partitions. A partition, preferably through its WLM, maintains a list of prioritized application goals with an indication of the quantity of each required resource. Application goals of equal priority are treated equally. (Note that an application may have more than one goal.) The requests of higher priority application goals are satisfied before lower priority application goals. Unallocated resources may be held in reserve or assigned to a default partition. Note that applications of the default partition may always be exceeding their goals and thus require a rule that such a condition is not an event to cause reallocation of resources or resizing of partitions.
Note that the partition resource entitlements are no longer a fixed configuration. As a partition’s needs change, some representative embodiments will automatically adjust partition entitlements based on resource availability and priority. Thus, some representative embodiments are dynamic. Also note that the administrator no longer has to estimate the initial entitlements, as some representative embodiments will determine the correct resource allocation to achieve the stated goals, and the computer system using some representative embodiments will converge on certain partition entitlement values that achieve the stated performance goals. Further note that priorities can be assigned to the different goals. Consequently, different goals can be met based on system resources, e.g., with a high amount of resources, all goals can be met; however, with a lesser amount of resources the higher priority goals will be met before the lower priority goals. Further note that changes to the system can be made as soon as the PLM receives resource requests, and action by the system administrator is not required. Note that in multiple partition systems, the administrator may define and prioritize goals that apply across all of the partitions and the different operating system instances operating in the partitions, instead of only being applied within a single partition.
FIG. 1 depicts the various components of one representative embodiment in a multiple partition system having multiple partitions 103-1, 103-2, 103-3 ... 103-N. Each partition may have one or more processors and other system resources, e.g. storage devices, I/O devices, etc. Each partition is preferably running its own operating system 16-1, ... 16-N, which provides segregation and survivability between the partitions. Note that the different partitions may have different amounts of resources, e.g. different numbers of processors. Also note that the partitions may be virtual, as the multiple partitions may reside in one or more physical computers.
Note that in an initial state the system may have the resources evenly divided among the partitions. Alternatively, the initial state of the system may provide only minimal resources to each partition, with the extra resources being held in reserve, for example, either unassigned or all placed into one or more partitions. The operations of PLM 101 and WLMs 10 will cause the system resources to be quickly allocated in a manner that is most efficient to handle the defined goals and priorities for the applications of each of the partitions.
The resources of the computer system are managed by PLM 101. PLM 101 receives resource requests from the different partitions. The requests can involve multiple priorities and multiple types of resources. For example, a request may state that the partition requires two processors and one storage device to handle all high priority applications, four processors and two storage devices to handle all high and medium priority applications, and seven processors and five storage devices to handle all high, medium, and low priority applications. The requests originate from WLMs 10-1, ... 10-N. WLMs 10 preferably produce the requests after totaling the resources necessary to achieve their respective goals. After receiving one or more requests, PLM 101 preferably reviews system resources and determines if reallocation is necessary based on existing resources, current requests, and the priorities of the requests. Thus, if a particular partition has a change in resource requirements, PLM 101 will examine the existing requirements of the other partitions with the new requirements of the particular partition, as well as the current resources, to determine if reallocation is necessary. PLM 101 may also initiate reallocation after a change in system resources, e.g. a processor fails, or additional memory is added, etc.
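The request in the example above might be encoded as a per-priority table (a hypothetical representation; the patent does not fix a format):

    # Resources required by one partition, by priority level
    # (1 = high only, 2 = high+medium, 3 = high+medium+low):
    resource_request = {
        1: {"processors": 2, "storage_devices": 1},
        2: {"processors": 4, "storage_devices": 2},
        3: {"processors": 7, "storage_devices": 5},
    }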
PLM 101 preferably determines whether reallocation is necessary by examining the priorities of the resource requests. A change in a high level request will typically cause reallocation. For example, if all device resources are consumed in handling high priority operations of the partitions, then a change in a low priority request would be ignored. On the other hand, a change in a high priority request, e.g. less resources needed, will cause reallocation of the resources, e.g. the excess resources from the oversupplied partition would be re-allocated among the other partitions based on the goals and priorities of their applications. PLM 101 then calculates a revised distribution of resources based on the goals and priorities of the applications of the different partitions. The revised distribution is then delivered to partition resource allocator 102. Allocator 102 preferably operates to resize the partitions, which is to move resources from one or more partitions to one or more other partitions based on the instructions provided by PLM 101. An example of such an allocator and partition resizing is described in U.S. application Ser. No. 09/562,590 entitled “RECONFIGURATION SUPPORT FOR A MULTI PARTITION COMPUTER SYSTEM,” filed Apr. 29, 2000, the disclosure of which is hereby incorporated herein by reference.

Note that resizing may cause considerable overhead to be incurred by the system. In such a case, moving resources from one partition to another reduces the available computing time. Thus, the determination by PLM 101 may include a threshold that must be reached before PLM 101 begins reallocation. The threshold may include multiple components, e.g. time, percent under/over capacity, etc. For example, a small over/under capacity may have to exist for a longer period of time before reallocation occurs, while a large over/under capacity may cause an immediate reallocation. This would prevent small, transient changes in resource need from causing reallocations in the system.
FIG. 2 depicts the various components of a partition according to one representative embodiment. Goals 21 preferably comprise a configuration file, which is defined by a user or system administrator, that describes the user’s preferences with regard to what characteristic(s) of the application is of interest and is being measured, what is the desired level of performance of the application in terms of the characteristic, and what is the priority of achieving this goal. A user can also specify time periods for a particular goal to be in effect. For example, a first application may be a first database and the user will specify in the configuration file that the characteristic is for a particular type of transaction to be completed within two seconds, and have a high priority. The application may also have a second goal for the same characteristic, e.g. the same type of transactions are to be completed within one half of a second, and have a low priority. A second application may be a second database which has a similar goal as that of the first database, namely for a particular type of transaction to be completed within two seconds, and have the same priority as the first database. Thus, resources would be allocated between the two applications so that the high priority goals will be met, and any excess resources would be given to the first application so that it can meet the lower priority “stretch” goal.
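The two-database example suggests a goals file along the following lines (a hypothetical encoding; the patent does not specify the file format):

    # Hypothetical goals 21 configuration for the example above.
    goals = [
        {"app": "database1", "metric": "transaction_time_s",
         "target": 2.0, "priority": "high"},
        {"app": "database1", "metric": "transaction_time_s",
         "target": 0.5, "priority": "low"},   # lower priority "stretch" goal
        {"app": "database2", "metric": "transaction_time_s",
         "target": 2.0, "priority": "high"},
    ]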
WLM 10 preferably receives performance information which describes the status of a particular characteristic or characteristics of each application 12, 13, 14 that is being monitored. WLM 10 also receives performance information which describes the status and/or other characteristics of the processors 11 and other devices 25 (e.g. I/O, storage, etc.) contained within partition 103.
The performance information is preferably supplied by performance monitor 23. As shown in FIG. 2, a single monitor is capable of handling multiple applications and devices; however, a different embodiment of the present invention may have multiple monitors, each monitoring one or more applications and devices. Performance monitor 23 is a small program that gathers specific information about the application and/or device. For example, if the application is a database, then a performance monitor measures access times for the database. As another example, if a device is a hard drive, then the performance monitor may measure data capacity. The information need not be strictly application performance; it can be any measurable characteristic of the workload (e.g. CPU usage). This information is being gathered continuously while the system is operating. The workload manager will sample the information at some interval specified by the administrator.
`
The output of the workload manager, derived from the ongoing performance reported by the monitors and given the goals by the user, is preferably periodically applied to PRM 15. The output of WLM 10 is the share or entitlement allocation to the different resources that is assigned to each application. For example, each share may approximately equate to 1/100 of a CPU operating second. Thus, within a second, an application having an entitlement of 10 will receive 10/100 of the second, provided that the application has at least one runnable process. Note that the time received may not be consecutive, but rather may be distributed across the one second interval. Note that a share may also equate to other parameters based on the resource being allocated, e.g. a percent of disk storage space or an actual number of bytes of disk storage space.
`
`
`
`
`8
`7
`allocation 305 of the requested entitlement by sending the
`Thepartition may have multiple numbers ofresources, e.g.
`allocation informationto the partition resource allocator 102.
`multiple CPUs and/or multiple storage devices. Thus, the
`Note that several messages may besent, with one or more for
`allocation can be placed all on one device or spread amongthe
`each application priority level and/orpartition. Alternatively,
`devices. For example, if a system contains four processors
`one message maybesent at the end 309, which lays out the
`and an allocation of twenty percentof all processor resources
`complete allocation of the resourcesfor all partitions. If not,
`is made, thirty percent of a first processor, ten percent of a
`then PLM 101 preferably arbitrates between the different
`second processor, twenty percent of a third processor, and
`partitions in a fair manner, as discussed with respect to block
`twenty percent of a four processor maysatisfy the total allo-
`310. After satisfying each partition with the application pri-
`cation. The allocation among the different devices is deter-
`ority group in block 305, PLM 101 then determines 306
`mined by the PRM 15. PRM 15 will move the application
`whetherthere are any more application priority groups. If so,
`around to various devices, as needed to attempt to ensure that
`then PLM 101 returns to block 303 andrepeats. If not, then
`it achieves twenty percent allocation.
`PLM determines 307 whether any unallocated resources
`WLM 10 also preferably sends resource requests to PLM
`remain. If not, then PLM 101 is finished 309. The allocated
`101. These requests may take the form ofa list that describes
`resource information is sentto the partition resource alloca-
`the resources required for partition 103 to meetits goals for its
`tor, and PLM 101 is finishedforthis iteration. After receiving
`different priorities. PLM 101 maythen decide to reallocate
`new requests, PLM 101 will begin again in block 301. Ifblock
`resources based ona request. PLM 101 maystore the different
`307 determines that resources are available, then PLM 101
`requests, which would permit PLM 101 to view the changes
`may assign the remaining resources (block 308) to a default
`in the requested resources. This would allow PLM 101 to
`partition, designate the resources as unassigned and hold
`anticipate changesin resources. For example, over a period of
`them in reserve (hoarding), or divide the remaining resources
`time, PLM 101 mayrealize that a particular partition always
`equally amongone or moreofthepartitions. Note that hoard-
`has a need for more resourcesat a particular time (or follow-
`ing may allow somerepresentative embodimentsto operate in
`ing a particular event), e.g. at four p.m., and thus PLM 101
`amoreefficient manner, as the assignment of extra resources
`mayreallocate resourcesto that particular partition before the
`maycausethe partitionsto overachieve their respective goals,
`partition sends a request. The storing of requests would also
`and consequently cause further reallocations, unless a ruleis
`allow for the setting of reallocation triggering criteria. A
`usedto prevent such reallocations. Then PLM 101 ends 309.
`simple trigger could be used that comparesa single message
`If PLM 101 determines in block 304 that the requested
`with the current resource allocation, e.g. a requested increase/
`amount for each partition within the application priority
`decrease of 5% or greater of the current allocation resources
`group cannotbesatisfied, then PLM 101 preferably arbitrates
`would trigger reallocation. More complex triggers could be
`betweenthe differentpartitions in a fair manner. For example,
`usedthat refer to the stored messages. For example, requests
`by designating 310 a current target value as the lowest value
`from a particular partition for increase/decrease of2% to <5%
`of the current allocation resource that continue for more than
`of (1) the lowest of any previously allocated amounts,
`one hour will cause reallocation.
`wherein the previously allocated amounts have not been pre-
`viously used for a target value, or (2) the lowest requested
`In one representative embodiment, PLM 101 may operate
`amountof onepartition of the priority group, which has not
`according to flow chart 300 shownin FIG. 3. PLM 101starts
`been used for a previoustarget value. Note thatcriteria (1) and
`operations 301 and receives 302 the resource requests from
`(2) do not includepartitions that have reached their requested
`WLMs. PLM 101 then optionally determines whetherto ini-
`amounts, as this will simplify the performance flow of PLM
`tiate reallocation 315. PLM 101 may compare the resource
`101 as depicted in FIG. 3 (namely, by reducing the numberof
`requests with the current allocations. If a particular partition
`times that blocks 310, 311, 312, and 313 are repeated). In
`has a request (for more or less resources) that exceeds a
`block 311, PLM 101 determines whether the target amount
`predeterminedthreshold, as compared with a current alloca-
`tion, then PLM 101 mayinitiate reallocation. Also, PLM 101
`f
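The FIG. 3 flow described above can be summarized in a simplified sketch (Python; the data structures are hypothetical, and the target-selection rule of blocks 310-313 is abbreviated): satisfy each priority group in full when possible; otherwise raise the partitions of the group toward successively higher targets, and divide any final shortfall equally.

    def plm_allocate(total, requests):
        # Sketch of flow chart 300. requests: {partition: {priority: amount}},
        # where amounts are cumulative per priority (as in the example of two
        # processors for high, four for high+medium, ...). Priority 1 is highest.
        alloc = {p: 0.0 for p in requests}
        remaining = float(total)
        for pri in sorted({pri for r in requests.values() for pri in r}):
            group = {p: r[pri] for p, r in requests.items() if pri in r}
            need = {p: max(0.0, amt - alloc[p]) for p, amt in group.items()}
            if sum(need.values()) <= remaining:           # block 304 satisfied
                for p, extra in need.items():             # block 305: grant all
                    alloc[p] += extra
                remaining -= sum(need.values())
                continue
            # Blocks 310-313 (simplified): designate successive current targets
            # and raise pending partitions to each target until supply runs out.
            pending = {p for p, extra in need.items() if extra > 0}
            while remaining > 0 and pending:
                target = min(group[p] for p in pending)   # next current target
                cost = sum(target - alloc[p] for p in pending
                           if alloc[p] < target)
                if cost <= remaining:
                    for p in pending:
                        alloc[p] = max(alloc[p], target)
                    remaining -= cost
                    # Ignore partitions that reached their requested amount.
                    pending = {p for p in pending if alloc[p] < group[p]}
                else:
                    share = remaining / len(pending)      # divide equally
                    for p in pending:
                        alloc[p] += min(share, group[p] - alloc[p])
                    remaining = 0.0
            if remaining <= 0:
                break
        # Block 308: any remainder may go to a default partition, be held in
        # reserve (hoarding), or be divided among partitions (not shown).
        return alloc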