`
`
`
United States Patent [19]
Liu et al.

[11] Patent Number: 5,031,089
[45] Date of Patent: Jul. 9, 1991
`
`
`
`
`
`
`
`
`
`
`
`
`
`
Attorney, Agent, or Firm—Thomas H. Jones; Harold W. Adams; John R. Manning

[57] ABSTRACT
`
In a distributed heterogeneous computer system having a plurality of computer nodes each operatively connected through a network interface to a network to provide for communications and transfers of data between the nodes, and wherein the nodes each have a queue for containing jobs to be performed, an improvement for dynamically reallocating the system's resources for optimized job performance. There is first logic at each node for dynamically and periodically calculating and saving a workload value as a function of the number of jobs on the node's queue. Second logic is provided at each node for transferring the node's workload value to other nodes on the network at the request of the other nodes. Finally, there is third logic at each node operable at the completion of each job. The third logic includes logic for checking the node's own workload value, logic for polling all the other nodes for their workload value if the checking node's workload value is below a preestablished value indicating the node as being underutilized and available to do more jobs, logic for checking the workload values of the other nodes as received, and logic for transferring a job from the queue of the other of the nodes having the highest workload value over a preestablished value indicating the other of the nodes as being overburdened and requiring job relief to the queue of the checking node. The third logic is also operable periodically when the node is idle.
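The three cooperating logic elements of the abstract can be sketched in outline form. The following Python sketch is illustrative only; the class name, the threshold values, and the helper methods are assumptions made for exposition and are not part of the patented apparatus.

```python
from collections import deque

# Illustrative thresholds (hypothetical values, not taken from the patent).
UNDER_LIMIT = 2   # below this, the node considers itself underutilized
OVER_LIMIT = 5    # above this, another node is considered overburdened

class Node:
    def __init__(self, name, peers=None):
        self.name = name
        self.queue = deque()          # jobs waiting for service
        self.peers = peers or []      # other nodes on the network

    # First logic: periodically recompute and save the workload value
    # as a function of the number of jobs on the node's queue.
    def update_workload(self):
        self.workload = len(self.queue)
        return self.workload

    # Second logic: report the saved workload value on request.
    def report_workload(self):
        return self.update_workload()

    # Third logic: run at job completion (and periodically when idle).
    def on_job_complete(self):
        if self.update_workload() >= UNDER_LIMIT:
            return None               # not underutilized; do nothing
        # Poll all the other nodes for their workload values.
        loads = {p: p.report_workload() for p in self.peers}
        donor = max(loads, key=loads.get, default=None)
        if donor is not None and loads[donor] > OVER_LIMIT:
            job = donor.queue.pop()   # take a job from the busiest node
            self.queue.append(job)    # ...onto the checking node's queue
            return job
        return None

a, b = Node("A"), Node("B")
a.peers, b.peers = [b], [a]
b.queue.extend(range(8))              # node B is overburdened
moved = a.on_job_complete()           # idle node A pulls a job from B
print(moved is not None, len(b.queue))
```

Because the underutilized (receiving) node initiates the poll, busy nodes do no extra work beyond answering workload requests, which is the point of the server-initiated design described later in the specification.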
`
`
`
`
`
`
`
`
`
`
`
`
[54] DYNAMIC RESOURCE ALLOCATION SCHEME FOR DISTRIBUTED HETEROGENEOUS COMPUTER SYSTEMS

[75] Inventors: Howard T. Liu, San Marino; John A. Silvester, Los Angeles, both of Calif.
`
`
`
`
`
`
`
[73] Assignee: United States of America as represented by the Administrator, National Aeronautics and Space Administration, Washington, D.C.

[21] Appl. No.: 292,124

[22] Filed: Dec. 30, 1988
`
`
`
`
`
`
`
`
`[51]
`Int. CL5 ............................................ .. G06F12/00
`
`
`
`
`
`
`
`[52] U.S.Cl. ............................... .. 364/200;364/281.3;
`364/281; 364/281.6; 364/281.8
`
`
`
`364/200 MS File, 900 MS File,
`
`
`
`
`
`
`
`
`364/281.3, 281, 281.6, 281.8, 975.5
`
`
`
`
`
`
`[58] Field of Search
`
`
`
[56] References Cited

U.S. PATENT DOCUMENTS

4,099,235  7/1978  Hoschler et al. .......... 364/200
4,403,286  9/1983  Fry et al. ............... 364/200
4,413,318 11/1983  Herrington ............... 364/200
4,495,570  1/1985  Kitajima et al. .......... 364/200
4,577,272  3/1986  Ballew et al. ............ 364/200
4,839,798  6/1989  Eguchi et al. ............ 364/200
4,852,001  7/1989  Tsushima et al. .......... 364/401

Primary Examiner—Joseph A. Popek
Assistant Examiner—Rebecca L. Rudolph

16 Claims, 3 Drawing Sheets
`
`
`
`
`
`
`
`
`
`‘
`
`
`
`
`
`
`
`
`
`
[Representative figure: networked computer nodes (COMPUTER #2, COMPUTER #3, and a further node), each with a NETWORK INTERFACE, TASK ALLOCATION & TRANSFER LOGIC, WAKEUP LOGIC, and a WORKLOAD INDICATOR, connected to a common network.]
Page 1 of 11
FIS Exhibit 1010
`
`
`
`
`
`
`
`530190‘;210Iwas1661‘5Kmwaned'5'[1
`
[FIG. 3: COMPUTER NODE 22 with JOB EXECUTION, a WORKLOAD INDICATOR, TASK ALLOCATION AND TRANSFER LOGIC, and a SERVICE RATE input.]

[FIG. 1: a CONTROL COMPUTER with task lists connected to multiple computers (up to COMPUTER NO. n), each with its own tasks.]
`
`
`
`
`
`311919¢I'S°fl
`
`
`
`
`
`
`
`1661‘6Klnr
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`680‘I£O‘S9J°ZWIS
`
`
`
`
`
`
`
`
`
`
`
`—
`
`
[Block diagram: networked computer nodes, each with a NETWORK INTERFACE, TASK ALLOCATION & TRANSFER LOGIC, WAKEUP LOGIC, and a WORKLOAD INDICATOR, connected to a common network.]
`
`
`
U.S. Patent          July 9, 1991          Sheet 3 of 3          5,031,089

[Flow chart: upon JOB COMPLETE, and when the node's own WORKLOAD is BELOW LIMIT, OBTAIN WORKLOAD OF OTHER NODES; if another node's WORKLOAD is ABOVE LIMIT, TRANSFER JOB IN; otherwise EXIT.]
`
`5,031,089
`
`1
`
`
`DYNAMIC RESOURCE ALLOCATION SCHEME
`
`
`
`FOR DISTRIBUTED HETEROGENEOUS
`
`
`
`COMPUTER SYSTEMS
`
`
`
`
`
ORIGIN OF THE INVENTION

The invention described herein was made in the performance of work under a NASA contract, and is subject to the provisions of Public Law 96-517 (35 USC 202) in which the Contractor has elected not to retain title.

TECHNICAL FIELD

The invention relates to resource allocation in computer systems and, more particularly, to a method and associated apparatus for shortening response time and improving efficiency of a heterogeneous distributed networked computer system by reallocating the jobs queued up for busy nodes to idle, or less-busy, nodes. In accordance with a novel algorithm, the load-sharing is initiated by the server device in a manner such that extra overhead is not imposed on the system during heavily-loaded conditions.

BACKGROUND ART

In distributed networked computer systems there is a high probability that one of the workstations will be idle while others are overloaded. Thus, the response times for certain tasks are longer than they would be if all the capabilities in the system could be shared fully. As is known in the art, the solution is to reallocate tasks from queues connected to busy computers to idle computer queues.

As depicted in FIG. 1, a distributed computer system 10 consists of several computers 12 with the same or different processing capabilities, connected together by a network 14. Each of the computers 12 has tasks 16 assigned to it for execution. In such a distributed multi-computer system, the probability is high that one of the computers 12 is idle while another computer 12 has more than one task 16 waiting in the queue for service. This probability is called the "imbalance probability". A high imbalance probability typically implies poor system performance. By reallocating queued tasks or jobs to the idle or lightly-loaded computers 12, a reduction in system response time can be expected. This technique is called "load sharing" and is one of the main foci of this invention. As also depicted in FIG. 1, such redistribution of the tasks 16 on a dynamic basis is known in the art. Typically, there is a control computer 18 attached to the network 14 containing task lists 20. On various bases, the control computer 18 dynamically reassigns tasks 16 from the lists 20 to various computers 12 within the system 10. For example, it is known in the art to have each of the computers 12 provide the control computer 18 with an indicator of the amount of computing time on tasks that is actually taking place. The control computer 18, with knowledge of the amount of use of each computer 12 available, is then able to reallocate the tasks 16 as necessary. In military computer systems, and the like, this ability to reconfigure, redistribute, and keep the system running is an important part of what is often referred to as "graceful degradation"; that is, the system 10 continues to operate as best it can to do the tasks at hand on a priority basis for as long as it can.

The inventors herein did a considerable amount of statistical analysis and evaluation of networked computer systems according to the known prior techniques for load distribution and redistribution. Their findings will now be set forth by way of example to provide a clear picture of the background and basis for the present invention.

The imbalance probability, IP, for a heterogeneous system can be calculated by mathematical techniques well known to those skilled in the art which, per se, are no part of the novelty of the present invention. There is a finite, calculable probability that I out of N computers comprising a networked system are idle. There is also a finite probability that all stations other than those I stations are busy, as well as a probability that there is exactly one job in each one of the remaining (N-I) stations, i.e. a finite probability that at least one out of (N-I) stations has one or more jobs waiting for service. By summing over the number of idle stations, from 1 to N, the imbalance probability for the whole system can be obtained. By way of example, in a homogeneous system, all the nodes (i.e. computers 12) have the same service rate and the same arrival rate. As the number of nodes increases, the peak of the imbalance probability goes higher. As the number of nodes increases to twenty, the imbalance probability approaches 1 when the traffic intensity (arrival rate divided by the service rate at each node) ranges from 40% to 80%. The statistical curves also indicate that the probability of imbalance is high during moderate traffic intensity. This occurs due to the fact that all nodes are either idle (i.e. there is low traffic intensity) or are busy (i.e. there is high traffic intensity). If the arrival rate is not evenly distributed, the imbalance probability becomes even higher. In the imbalance probability of a two-node heterogeneous system, the faster node is twice as fast as the slower one and the work is evenly distributed. If the work is not balanced, it has been observed that the imbalance probability goes even higher during high traffic intensity at the slower node. At this point, the slower node is heavily loaded even though the faster node is only 50% utilized.

Numerous studies have addressed the problem of resource-sharing in distributed systems. It is convenient to classify these strategies as being either static or dynamic in nature and as having either a centralized or decentralized decision-making capability. One can further distinguish the algorithms by the type of node that takes the initiative in the resource-sharing. Algorithms can either be sender-initiated or server-initiated. Some algorithms can be adapted to a generalized heterogeneous system while others can only be used in a homogeneous system. These categories are further explained as follows:

Static/Dynamic: Static schemes use only the information about the long-term average behavior of the system, i.e. they ignore the current state. Dynamic schemes differ from static schemes by determining how and when to transfer jobs based on the time-dependent current system state instead of the average behavior of the system. The major drawback of static algorithms is that they do not respond to fluctuations of the workload. Dynamic schemes attempt to correct this drawback but are more difficult to implement and may introduce additional overhead. In addition, dynamic schemes are hard to analyze.

Centralized/Decentralized: In a system with centralized control, jobs are assumed to arrive at the central controller which is responsible for distributing the jobs among the network's nodes; in a decentralized system,
`
`
`
`5,031,089
`
`
`
`
`
`3
`
`
`
`
`
`
`
`
`
jobs are submitted to the individual nodes and the decision to transfer a job to another node is made locally. This central dispatcher approach is quite restrictive for a distributed system.

Homogeneous/Heterogeneous system: In the homogeneous system, all the computer nodes are identical and have the same service rate. In the heterogeneous system, the computer nodes do not have the same processing power.

Sender/Server Initiated: If the source node makes a determination as to where to route a job, this is defined as a sender-initiated strategy. In server-initiated strategies, the situation is reversed, i.e., lightly-loaded nodes search for congested nodes from which work may be transferred.
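Although this excerpt gives no closed-form expression, the imbalance probability discussed in the background can be illustrated numerically under a simplified model that we assume here for exposition: N independent M/M/1 stations with common traffic intensity rho, where a station is idle with probability 1 - rho, busy with no waiting job with probability rho(1 - rho), and has at least one waiting job with probability rho squared. The sketch below (the function name and the independence assumption are ours, not the patent's) reproduces the qualitative observation that with twenty nodes at moderate traffic intensity the probability approaches 1.

```python
def imbalance_probability(rho, n):
    """P(some station is idle AND some other station has a waiting job),
    assuming n independent M/M/1 stations at traffic intensity rho."""
    p_no_idle = rho ** n                    # every station busy
    p_no_wait = (1.0 - rho * rho) ** n      # no station has a job waiting
    p_neither = (rho * (1.0 - rho)) ** n    # all busy, none waiting
    # Inclusion-exclusion on the events "no idle station" and "no waiting job".
    return 1.0 - p_no_idle - p_no_wait + p_neither

# With twenty nodes at 50% traffic intensity, imbalance is nearly certain.
print(round(imbalance_probability(0.5, 20), 3))  # -> 0.997
```

Under this assumed model the probability also rises with the number of nodes at fixed traffic intensity, matching the trend described in the background discussion.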
`
`
`
`
`
`
`
`
`
`
The prior art as discussed in the literature (see Listing of Cited References hereinafter) will now be addressed with particularity.
`
`
`
`
`
`
`
`
`
First, there are the static strategies. Stone [Ston 78] developed a centralized maximum flow algorithm for two processors (i.e. computer nodes) by holding the load of one processor fixed and varying the load on the other processor. Ni and Hwang [Hwan 81] studied the problem of load balancing in a multiple heterogeneous processor system with many job classes. In this system, the number of processors was extended to more than two. Tantawi and Towsley [Tant 85] formulated the static resource-sharing problem as a nonlinear programming problem and presented two efficient algorithms, the parametric-study algorithm and the load-balancing algorithm. Silva and Gerla [Silv 84] used a downhill queueing procedure to search for the static optimal job assignment in a heterogeneous system that supports multiple job classes and site constraints. Recently, Kurose and Singh [Kuro 86] used an iterative algorithm to deal with the static decentralized load-sharing problem. Their algorithm was examined by theoretical and simulation techniques.
`
`
`
`
`
`
`
`
Next, there are the dynamic strategies. Chow and Kohler [Chow 79] used a queueing theory approach to examine a resource-sharing algorithm for a heterogeneous two-processor system with a central dispatcher. Their objective was to minimize the mean response time. Foschini and Salz [Fosc 79] generalized one of the methods developed by Chow and Kohler to include multiple job dispatchers. Wah [Wah 84] studied the communication overhead of a centralized resource-sharing scheme designed for a homogeneous system. Load-balancing of the Purdue ECN (Engineering Computer Network) was implemented with a dynamic decentralized RXE (remote execution environment) program [Hwan 82]. With the decentralized RXE, the load information of all the processors was maintained in each network machine's kernel. One of the problems with this approach is the potentially high cost of obtaining the required state information. It is also possible for an idle processor to acquire jobs from several processors and thus become overloaded. Ni and Xu [Ni 85] propose the "draft" algorithm for a homogeneous system. Wah and Juang [Wah 85] propose a window control algorithm to schedule the resource in local computer systems with a multi-access network. Wang and Morris [Wang 85] studied ten different algorithms for homogeneous systems to evaluate the performance differences. Eager, et al. [Eage 86] addressed the problem of decentralized load sharing in a multiple system using dynamic-state information. Eager discussed the appropriate level of complexity for load-sharing policies and showed that schemes that use relatively simple state information do very well and perform quite closely to the optimal expected performance. The system configuration studied by Eager, et al. was also a homogeneous system. Towsley and Lee [Tows 86] used the threshold of the local job queue length at each host to make decisions for remote processing. This computer system was generalized to be a heterogeneous system.
`
`
`
`
`
`
`
`
`
`
`
`
`
`
In summary, most of the work reported in the literature has been limited to either static schemes, centralized control, homogeneous systems, or to two-processor systems where overhead considerations were ignored. All of these approaches make assumptions that are too restricted to apply to most real computer system installations. The main contribution of this reported work is the development of a dynamic, decentralized, resource-sharing algorithm for a heterogeneous multiple (i.e. greater than two) processor system. Because it is server-initiated, this approach thus differs significantly from the sender-initiated approach described in [Tows 86]. The disadvantage of this prior art sender-initiated approach is that it imposes extra overhead in the heavily-loaded situation and, therefore, it could bring the system to an unstable state.
LIST OF CITED REFERENCES

[Wah 85] Baumgartner, K. M. and Wah, B. W. "The Effects of Load Balancing on Response Time for Local Computer Systems with a Multiaccess Network," IEEE International Comm. Conf., 1985, pp. 10.1.1-10.1.5.

[Chow 79] Chow, Y. C. and Kohler, W. H. "Models for Dynamic Load Balancing in a Heterogeneous Multiple Processor System," IEEE Trans. Computers, Vol. C-28, No. 5, pp. 345-361, May 1979.

[Eage 86] Eager, D. L., Lazowska, E. D., and Zahorjan, J. "Adaptive Load Sharing in Homogeneous Distributed Systems," IEEE Trans. on Software Eng., Vol. SE-12, No. 5, May 1986.

[Eage 85] Eager, D. L., Lazowska, E. D., and Zahorjan, J. "A Comparison of Receiver-Initiative and Sender-Initiative Dynamic Load Sharing," Tech Report No. 85-04-01, Dept. of Computer Science, University of Washington, April 1985.

[Fosc 79] Foschini, G. J. and Salz, J. "A Basic Dynamic Routing Problem with Diffusion," IEEE Trans. Commun., Vol. Com-26, pp. 320-327, March 1978.

[Huan 82] Hwang, K. and Wah, B. "A UNIX-Based Local Computer Network with Load Balancing," IEEE Computer, April 1982.

[Hwan 81] Hwang, K. and Ni, L. M. "Optimal Load Balancing Strategies for a Multiple Process System," Proc. of Int'l Conf. on Parallel Processing, August 1981.

[Hwan 82] Hwang, K. and Croft, W. J., et al. "A UNIX-Based Local Computer Network with Load Balancing," IEEE Computer magazine, April 1982.

[Kari 85] Karian, Z. A. "GPSS/PC: Discrete-Event Simulation on the IBM PC," Byte, October 1985.

[Klei 75] Kleinrock, L. "Queueing Systems," Vol. 1: Theory, John Wiley & Sons, 1975.

[Kuro 86] Kurose, J. and Singh, S. "A Distributed Algorithm for Optimal Static Load Balancing in Distributed Computer Systems," IEEE Infocom, April 1986.

[Ni 81] Ni, L. M. and Hwang, K. "Optimal Load Balancing Strategies for a Multiple Processor System," Proc. Tenth Int'l Conf. Parallel Processing, pp. 352-357, August 1981.

[Ni 85] Ni, L. M. "A Distributed Drafting Algorithm for Load Balancing," IEEE Trans. on Software Eng., Vol. SE-11, No. 10, October 1985.

[Silv 84] Silva, E. S. and Gerla, M. "Load Balancing in Distributed Systems with Multiple Classes and Site Constraints," Performance 84 (North Holland), pp. 17-33, 1984.

[Ston 78] Stone, H. S. "Critical Load Factors in Two-Processor Distributed Systems," IEEE Trans. on Software Engineering, Vol. SE-4, No. 3, pp. 254-258, May 1978.

[Tows 86] Towsley, D. and Lee, K. J. "A Comparison of Priority-Based Decentralized Load Balancing Policies," ACM Performance Evaluation Review, Vol. 14, No. 1, pp. 70-77, May 1986.

[Tows] Towsley, D. and Lee, K. J. "On the Analysis of a Decentralized Load Sharing Policy in Heterogeneous Distributed Systems," submitted to IEEE Trans. on Computers.

[Tant 85] Tantawi, A. N. and Towsley, D. "Optimal Static Load Balancing in Distributed Computer Systems," JACM, Vol. 32, No. 2, pp. 445-465, April 1985.

[Triv 82] Trivedi, S. "Probability and Statistics with Reliability, Queueing and Computer Science Applications," Prentice-Hall, Inc., 1982.

[Wah 85] Wah, B. and Lien, Y. N. "Design of Distributed Databases on Local Computer Systems with a Multi-Access Network," IEEE Trans. Software Engineering, Vol. SE-11, No. 7, July 1985.

[Wah 85] Wah, B. and Juang, J. Y. "Resource Scheduling for Local Computer Systems with a Multi-Access Network," IEEE Trans. on Computers, Vol. C-34, No. 12, December 1985.

[Wang 85] Wang, Y. T. and Morris, R. J. T. "Load Sharing in Distributed Systems," IEEE Trans. on Computers, Vol. C-34, pp. 204-217, March 1985.
`
`
`
`
`
`
`
`
The foregoing articles and reports from the literature are only generally relevant for background discussion purposes and, since copies are not readily available to applicants for filing herewith, they are not being provided. In addition to the foregoing non-supplied articles from the literature, however, copies of the following relevant U.S. Letters Patent are being provided herewith:

[1] Hoschler, H., Raimar, W., and Brandmaier, K. "Method of Operating a Data Processing System," U.S. Pat. No. 4,099,235, July 4, 1978.

[2] Kitajima, H. and Ohmachi, K. "Processing Request Allocator for Assignment of Loads in a Distributed Processing System," U.S. Pat. No. 4,495,570, Jan. 22, 1985.

[3] Fry, S. M., Hempy, H. O., and Kittinger, B. E. "Balancing Data-Processing Workloads," U.S. Pat. No. 4,403,286, Sept. 6, 1983.
`
`
`
`
`
`
`
`
`
With respect to the above-listed U.S. Patents and the teaching thereof vis-a-vis the present invention to be described hereinafter, the inventors herein have invented a new dynamic load-balancing scheme for a distributed computer system consisting of a number of heterogeneous hosts connected by a local area network (LAN). As mentioned above, numerous studies have addressed the problem of resource-sharing in distributed systems. For purposes of discussion and distinguishing, it is convenient to classify these strategies as being either static or dynamic in nature and as having either a centralized or decentralized decision-making capability. One can further distinguish the algorithms employed by the type of node that takes the initiative in the resource-sharing. Algorithms can be either sender-initiated or receiver-initiated. Some algorithms can be adapted to a generalized heterogeneous system while others can be used only in a homogeneous system. These categories are further addressed with respect to the above-referenced prior art patents as follows.
`
`
`
`
Centralized/Decentralized: In a system with centralized control (as shown in FIG. 1) jobs arrive at the central control computer 18 which is responsible for distributing the jobs among the network nodes. In a decentralized system, jobs are submitted to the individual nodes and the decision to transfer a job to another node is made locally. The central dispatcher approach is quite restrictive for a distributed system. In the teachings of their patent, Kitajima, H. and Ohmachi, K. assign a processing request allocator to be the single controller of their centralized scheme. One of the problems with this approach is the potentially high cost of obtaining the required state information. It is also possible for an idle processor to acquire jobs from several processors and thus become overloaded.
`
`
`
`
`
`
`
`
Homogeneous/Heterogeneous system: In a homogeneous system, all computer nodes must be identical and have the same service rate. In the heterogeneous system, the computer nodes do not have the same processing power. In their patent, Fry, S. M., Hempy, H. O., and Kittinger, B. E. disclose a scheme to balance data-processing workloads in a homogeneous environment.
`
`
`
`
`
Sender/Receiver Initiated: If the source node makes a determination as to where to route a job, this is defined as a sender-initiated strategy. In receiver-initiated strategies, the situation is reversed, i.e., lightly-loaded nodes search for congested nodes from which work may be transferred. In their patent, Hoschler, H., Raimar, W., and Brandmaier disclose a sender-initiated scheme. The inventors herein have proved that the receiver-initiated approach is superior at medium to high loads and, therefore, have incorporated such an approach in their invention in a novel manner.
`
`
`
`
`
`
`
`
Static/Dynamic: Static schemes use only the information about the long-term average behavior of the system, i.e. they ignore the current state. Dynamic schemes differ from the static schemes by determining how and when to transfer jobs based on the time-dependent current system state instead of the average behavior. The major drawback of static algorithms is that they do not respond to fluctuations of the workload. Dynamic schemes attempt to correct this drawback.
`
`
STATEMENT OF THE INVENTION

Accordingly, an object of the invention is the providing of a dynamic decentralized resource-sharing algorithm for a heterogeneous multi-processor system.
`
`
`
`
`
`
`
`
`
`
It is another object of the invention to provide a dynamic decentralized resource-sharing algorithm for a heterogeneous multi-processor system which is receiver-initiated in heavy load so that it does not impose extra overhead in the heavily-loaded situation and, therefore, will not bring the system to an unstable state.
`
`
`
`
`
`
`
`
`
Another object of the present invention is to prevent an idle node in a heterogeneous multi-processor system from becoming isolated from the resource-sharing process, as can happen with the systems of Fry and Kitajima, by providing a wakeup timer used at each idle node to periodically cause the idle node to search for a job that can be transferred from a heavily-loaded node.
`
`
`
`
`
`
`
`
`
`
Still another object of the present invention is to use the local queue length and the local service rate ratio at each node as a more efficient workload indicator.
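As a sketch only, one plausible reading of this object (the exact combination is not spelled out in this excerpt) is a workload value that scales the local queue length by the node's service-rate ratio, so that a slower node reports a higher workload for the same queue length. The function name and the reference-rate parameter below are illustrative assumptions, not language from the patent.

```python
def workload_value(queue_length, service_rate, reference_rate=1.0):
    """Workload indicator combining local queue length with the local
    service-rate ratio: for the same queue length, a node that is
    slower relative to the reference reports a higher workload.
    (Illustrative formula; the patent excerpt does not give it.)"""
    ratio = service_rate / reference_rate
    return queue_length / ratio

# A node half as fast as the reference looks twice as loaded
# for the same four queued jobs.
print(workload_value(4, 0.5), workload_value(4, 1.0))  # -> 8.0 4.0
```

A comparison made on such normalized values lets heterogeneous nodes be ranked on a common scale, which a raw queue-length count alone cannot provide.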
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
It is yet a further object of the invention to provide a dynamic decentralized resource-sharing algorithm for a heterogeneous multi-processor system which dynamically adjusts to the traffic load and does not generate extra overhead during high traffic loading conditions.
`
`
`
`
`
`
`
`
`
`