`
`US 8,572,138 B2
`
`(56)
`
`References Cited
`
`FOREIGN PATENT DOCUMENTS
`
`U.S. PATENT DOCUMENTS
`
`2003/0135509 Al*
`7/2003
`2003/0135658 Al*
`7/2003
`2003/0140282 Al
`7/2003
`9/2003
`2003/0177176 Al
`2003/0192035 Al* 10/2003
`5/2004
`2004/0088694 Al
`2004/0162741 Al
`8/2004
`2004/0181794 Al
`9/2004
`2004/0187104 Al
`9/2004
`2004/0230948 Al * 11/2004
`12/2004
`2004/0260734 Al
`2005/0005200 Al
`1/2005
`2005/0039180 Al*
`2/2005
`2005/0138370 Al*
`6/2005
`2005/0193265 Al
`9/2005
`2006/0173856 Al
`8/2006
`2006/0173857 Al
`8/2006
`2006/0173895 Al
`8/2006
`2006/0173984 Al
`8/2006
`2006/0173993 Al
`8/2006
`2006/0173994 Al
`8/2006
`2006/0174238 Al
`8/2006
`2006/0259292 Al* 11/2006
`2007/0168919 Al*
`7/2007
`2007/0169049 Al*
`7/2007
`
`Davis et al. ................... 707/100
`Haggar et al.
`................ 709/312
`Kaler et al.
`Hirschfeld et al.
`Duesterwald aid ........... 717/138
`Ho
`Flaxer et al.
`Coleman et al.
`Sardesai et al.
`Talwar et al.
`................. 717/114
`Ren et al.
`Matena et al.
`Fultheim et al.
`Goud et al.
`Lin et al.
`Jackson et al.
`Jackson
`Engquist et al.
`Emeis et al.
`Henseler et al.
`Emeis et al.
`Henseler et al.
`.......... 703/27
`Solomon et al.
`Henseler et al. .............. 717/101
`Gingell et al. ................ 717/151
`
`............ 718/1
`713/164
`
`WO
`WO
`WO
`WO
`WO
`
`WO/2006/083727 Al
`WO/2006/083893 Al
`WO/2006/083894 Al
`WO/2006/083895 Al
`WO/2006/083901 Al
`
`8/2006
`8/2006
`8/2006
`8/2006
`8/2006
`
`OTHER PUBLICATIONS
`
B. Urgaonkar et al., "Resource Overbooking and Application Profiling in Shared Hosting Platforms," Proceedings of the 5th Symposium on Operating Systems Design and Implementation, 17 pgs., XP-002387427, 2002.
G. Lodi et al., "QoS-aware Clustering of Application Servers," Proceedings of the 1st IEEE Workshop on Quality of Service for Application Servers, in conjunction with the 23rd International Symposium on Reliable Distributed Systems, 6 pages, XP-002383792, Oct. 17, 2004.
"Preinstalling Microsoft Windows XP by Using the OEM Preinstallation Kit, Part 1," XP-002301441, Apr. 4, 2003, 24 pages.
R. Mark Koan et al., "It Takes a Village to Build an Image," XP-002384269, 2003, pp. 200-207.
U.S. Appl. No. 11/607,819, entitled "Automated Deployment and Configuration of Applications in an Autonomically Controlled Distributed Computing System," filed Dec. 1, 2006.
U.S. Appl. No. 11/607,820, entitled "Automated Deployment and Configuration of Applications in an Autonomically Controlled Distributed Computing System," filed Dec. 1, 2006.
`
`* cited by examiner
`
`
`
`
`
`
FIG. 2 (Sheet 2 of 29) — schematic diagram of an example model of an enterprise that logically defines an enterprise fabric; drawing not reproducible as text.
`
`
`
FIG. 3 (Sheet 3 of 29) — flow diagram recovered from the drawing:
(50) RECEIVE INPUT DEFINING HIERARCHY OF DISTRIBUTED COMPUTING SYSTEM
(52) RECEIVE INPUT IDENTIFYING NODE REQUIREMENTS OF TIERS
(54) SELECT NODES FROM FREE POOL OR LOWER PRIORITY TIER
(56) DYNAMICALLY ASSIGN NODES TO NODE SLOTS OF THE DEFINED TIERS
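For illustration, the selection rule of step (54) can be sketched in code. This is an editor's sketch, not the patented implementation; the Node and Tier types and the numeric-priority convention are assumptions made for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str

@dataclass
class Tier:
    name: str
    priority: int                  # assumption: lower number = more important tier
    nodes: list = field(default_factory=list)

def select_node(free_pool: list, tiers: list, requester: Tier):
    """Step (54): take a node from the free pool if one is available;
    otherwise harvest one from the least important lower-priority tier."""
    if free_pool:
        return free_pool.pop()
    donors = [t for t in tiers if t.priority > requester.priority and t.nodes]
    if not donors:
        return None
    donor = max(donors, key=lambda t: t.priority)  # least important donor
    return donor.nodes.pop()
```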
`
`
`
`
`
`
`
FIG. 6 (Sheet 6 of 29) — flow diagram recovered from the drawing:
(110) IDENTIFY EXCESS NODE CAPACITY ON TIER
(112) CALCULATE ENERGY OF ALL THE NODES IN THE TIER
(114) SELECT NODE WITH THE HIGHEST ENERGY (FURTHEST MATCH FROM TIER'S IDEAL NODE)
(116) RETURN SELECTED NODE TO THE FREE POOL
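The "energy" of steps (112)-(114) can be read as a distance from the tier's ideal node. A minimal sketch follows, assuming node attributes are held in dictionaries and that energy is a weighted absolute difference; the patent does not fix a particular formula at this level of detail.

```python
def node_energy(node: dict, ideal: dict, weights: dict) -> float:
    """Weighted distance between a node's inventoried attributes and the
    tier's ideal node; a higher value means a worse fit (step 112)."""
    return sum(w * abs(node[k] - ideal[k]) for k, w in weights.items())

def harvest_excess(tier_nodes: list, ideal: dict, weights: dict,
                   free_pool: list) -> dict:
    """Steps (114)-(116): pick the highest-energy node and return it
    to the free pool."""
    worst = max(tier_nodes, key=lambda n: node_energy(n, ideal, weights))
    tier_nodes.remove(worst)
    free_pool.append(worst)
    return worst

tier = [{"cpus": 8, "ram_gb": 32}, {"cpus": 2, "ram_gb": 4}]
ideal, weights = {"cpus": 2, "ram_gb": 4}, {"cpus": 1.0, "ram_gb": 0.25}
free: list = []
print(harvest_excess(tier, ideal, weights, free))  # the oversized node leaves
```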
`
`
`
FIG. 7 (Sheet 7 of 29) — screen illustration of an exemplary user interface for defining tiers in a particular domain; drawing not reproducible as text.
`
`
`
FIG. 8 (Sheet 8 of 29) — screen illustration of an exemplary user interface for defining properties of the tiers; drawing not reproducible as text.
`
`
`
FIG. 9 (Sheet 9 of 29) — screen illustration of an exemplary user interface for viewing and identifying properties of a computing node; drawing not reproducible as text.
`
`
`
`
`
FIG. 11 (Sheet 11 of 29) — screen illustration of an exemplary user interface for viewing a hardware inventory report; drawing not reproducible as text.
`
`
`
FIG. 12 (Sheet 12 of 29) — screen illustration of an exemplary user interface for viewing discovered nodes located in the free pool; drawing not reproducible as text.
`
`
`
`
`
FIG. 14 (Sheet 14 of 29) — screen illustration of an exemplary user interface for viewing alerts for the distributed computing system; drawing not reproducible as text.
`
`
`
FIG. 15 (Sheet 15 of 29) — block diagram of control node 12, labels recovered from the rotated drawing: monitoring subsystem 202, service level automation infrastructure 204 and business logic tier 206, exchanging status data 203, fabric actions 207, actual state 208, initial expected state 210/211, action requests 212 and initial monitoring information 214.
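A minimal sketch of the sense/compare/act loop suggested by FIG. 15, assuming for simplicity that expected and actual state reduce to per-tier node counts (the figure implies richer state than this):

```python
def automation_cycle(expected_state: dict, actual_state: dict) -> list:
    """One pass of the automation loop: compare the expected state
    against the monitored actual state and emit action requests for
    the rest of the control node to carry out."""
    actions = []
    for tier, wanted in expected_state.items():
        have = actual_state.get(tier, 0)
        if have < wanted:
            actions.append(("allocate_node", tier, wanted - have))
        elif have > wanted:
            actions.append(("harvest_node", tier, have - wanted))
    return actions

# Example: one web-tier node short of the expected three.
print(automation_cycle({"web": 3}, {"web": 2}))
# -> [('allocate_node', 'web', 1)]
```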
`
`
`
`
`
`
`
FIG. 18 (Sheet 18 of 29) — working memory 270 associated with the rule engines of the SLAI: expected state 272 (read only; e.g., MAX NODES: 5), actual state 274 (read/write; e.g., 5 MINUTE LOAD AVERAGE: 2.4) and local object 276 (read/write), accessed by the BLT and by the sensor/rule engines.
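FIG. 18's split between read-only expected state and read/write actual state can be illustrated as follows. The rule body and the 4.0 threshold (borrowed from the Load5Average maxThreshold shown in FIG. 22) are illustrative assumptions, not the SLAI's actual rule language.

```python
from dataclasses import dataclass

@dataclass(frozen=True)            # expected state (272) is read-only
class ExpectedState:
    max_nodes: int

@dataclass                         # actual state (274) is read/write
class ActualState:
    node_count: int
    load5: float                   # sensed 5-minute load average

def overload_rule(expected: ExpectedState, actual: ActualState,
                  threshold: float = 4.0):
    """Fire when sensed load crosses the threshold and the tier is
    still allowed to grow; a rule engine would post the resulting
    action back into working memory."""
    if actual.load5 > threshold and actual.node_count < expected.max_nodes:
        return "request_additional_node"
    return None

# With the values shown in FIG. 18 (max nodes 5, load average 2.4),
# the rule does not fire:
print(overload_rule(ExpectedState(5), ActualState(3, 2.4)))  # None
```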
`
`
`
FIG. 19 (Sheet 19 of 29) — block diagram of business logic tier 206, labels recovered from the rotated drawing: web service definition language (WSDL) interfaces 300 expose a fabric administration service 304, a fabric view service 306, a user administration service 308 and an event subsystem to web service clients 313; a report generator 302 produces reports 314; a task manager (task data 310, tasks, interface 311/312) handles scheduled work; action requests 212 pass to the SLAI 204.
`
`
`
`
`
`
`
`0 ....
`
`N
`N
`.....
`r,J =(cid:173)
`
`('D
`('D
`
`\,Ci
`N
`
`~
`
`0 ....
`
`N
`'-"\,Ci
`N
`:-+-
`0
`
`(')
`
`~ = ~
`
`~
`~
`~
`
`r:J)_ .
`~
`
`period="60"
`minThreshold="1"
`maxThreshold="-1"
`frequency=" 15"
`*100"
`"PercentMemoryFree
`expression=
`"Free Memory"
`attribute=
`
`period="15"
`minThreshold="-1"
`maxThreshold="4"
`frequency=" 15"
`"Load5Average"
`expression=
`"LoadAverage"
`attribute=
`
`period="30"
`minThreshold="5"
`maxThreshold="20"
`frequency=" 15"
`"Pending Requests"
`expression=
`"Pending Requests"
`attribute=
`
`period="30"
`minThreshold="5.0"
`0"
`maxThreshold="20.
`frequency=" 15"
`Count"
`Execute ThreadTotal
`ount)*100/
`Execute Thread Id leC
`ICount-
`"(Execute ThreadTota
`expression=
`Percentage"
`"BusyThread-
`attribute=
`
`TotalCount
`Execute Thread-
`name=
`
`ldleCount
`Execute Thread-
`name=
`
`Pend Requests
`name=
`
`Memory
`PercentFree-
`name=
`
`Load5Average
`name=
`
`data.ear"
`domains/
`user_projects/
`path="/
`name="App"
`
`DBMS_ADK.ear"
`DK" path="/lib/
`name="DBMS A
`
`worklist.ear"
`worklist/
`path="/lib/
`Interface"
`Worker User
`name="Worklist
`
`ejbs.ear"
`EJBs" path="/lib/
`name="System
`
`designtime.ear"
`path="/lib/ai-
`Design-time"
`name="AI
`
`y=120
`expectedStartupDela managedServer_ 1
`
`Name=
`172.31.64.202
`IP=
`
`managedServer_0
`Name=
`172.31.64.201
`IP=
`
`"PathCluster"
`clusterName=
`
`"WebAdmin"
`adminTier=
`
`"1100"
`adminPort=
`
`"172.31.64.201"
`adminlP=
`
`Data Domain
`Name=
`
`CONSTRAINTS
`DEPLOYMENT
`
`LEVELS
`SERVICE
`
`ATTRIBUTES
`MONITORED
`
`SERVICE
`
`VALUES
`
`MONITORED
`
`SERVICES
`
`NODE
`
`NODES
`
`APPLICATION
`
`APPLICATION
`
`362
`
`360
`
`"
`
`......
`Portal-Domain
`name=
`
`minNodes=1
`
`maxNodes=2
`
`load Delay= 120
`
`00 = N
`
`"'""' w
`"'N
`-....l
`tit
`00
`d r.,;_
`
`374
`
`I
`
`\
`
`376
`
`"-
`
`FIG. 22
`
`~
`
`Netflix, Inc. - Ex. 1001, Page 000024
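For readability, the matrix data recovered from the drawing can be restated as a nested structure. The names and values below come from FIG. 22; the dict layout itself is the editor's assumption about how such a matrix might be held in memory, not the patent's storage format.

```python
# Editor's illustration: the FIG. 22 example application restated as a
# nested Python structure. Field names mirror the drawing's labels.
application_matrix = {
    "applications": [
        {
            "name": "Portal-Domain",
            "minNodes": 1,
            "maxNodes": 2,
            "loadDelay": 120,
            "adminIP": "172.31.64.201",
            "adminPort": "1100",
            "adminTier": "WebAdmin",
            "clusterName": "PathCluster",
            "nodes": [
                {"name": "managedServer_0", "ip": "172.31.64.201"},
                {"name": "managedServer_1", "ip": "172.31.64.202",
                 "expectedStartupDelay": 120},
            ],
            "monitoredAttributes": [
                # Two of the four attributes shown in the drawing:
                {"attribute": "LoadAverage", "expression": "Load5Average",
                 "frequency": "15", "minThreshold": "-1",
                 "maxThreshold": "4", "period": "15"},
                {"attribute": "Free Memory",
                 "expression": "PercentMemoryFree*100",
                 "frequency": "15", "minThreshold": "1",
                 "maxThreshold": "-1", "period": "60"},
            ],
        }
    ]
}
```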
`
`
`
FIG. 23 (Sheet 23 of 29) — flow diagram recovered from the drawing:
(380) STAGE APPLICATION WITHIN STAGING ENVIRONMENT AND GENERATE APPLICATION DEFINITION
(382) DEFINE APPLICATION CONFIGURATION PROPERTIES
(384) GENERATE APPLICATION ENTRY USING APP. DEFINITION AND APP. CONFIGURATION PROPERTIES
(386) MODIFY APPLICATION ENTRY
(388) INSERT APPLICATION ENTRY INTO APPLICATION MATRIX
(390) APPLICATION MATRIX ALERTS CONFIGURATION PROCESSOR
(392) CONFIGURATION PROCESSOR UPDATES APPLICATION RULE ENGINE AND MONITORING SUBSYSTEM
(394) CONTROL NODE HAS AUTONOMIC CONTROL OVER APPLICATION
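Steps (388)-(392) amount to an observer pattern: inserting an entry alerts the configuration processor, which in turn updates the rule engine and monitoring subsystem. A minimal sketch, with hypothetical class and function names standing in for the patent's components:

```python
class ApplicationMatrix:
    """Holds application entries and alerts registered listeners when
    an entry is inserted (steps 388-390); names here are hypothetical."""
    def __init__(self):
        self.entries = []
        self.listeners = []

    def insert(self, entry: dict):
        self.entries.append(entry)
        for notify in self.listeners:    # step 390: alert listeners
            notify(entry)

def configuration_processor(entry: dict):
    # Step 392: would update the application rule engine and the
    # monitoring subsystem for the newly inserted application.
    print(f"bringing {entry['name']} under autonomic control")

matrix = ApplicationMatrix()
matrix.listeners.append(configuration_processor)
matrix.insert({"name": "Portal-Domain"})   # step 388
```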
`
`
`
FIG. 24 (Sheet 24 of 29) — block diagram: physical node 400 executes virtual machine manager 402, which hosts virtual machines 404A-404N; each virtual machine runs operating system and application software 406A-406N (stacks 408A-408N).
`
`
`
FIG. 25 (Sheet 25 of 29) — flow diagram recovered from the drawing:
(410) SET ASIDE A NODE AS AN IMAGE HOST
(412) INSTALL VIRTUAL MACHINE MANAGER ON HOST
(414) CONFIGURE VIRTUAL MACHINE MANAGER WITH DESIRED NUMBER OF VM INSTANCES
(416) PERFORM NETWORK OR SAN CONFIGURATION FOR NODE
(418) CAPTURE NODE WITH THE CONTROL NODE, SPECIFY IP ADDRESS, AND SPECIFY CAPTURE OF A VIRTUAL MACHINE MANAGER
`
`
`
FIG. 26 (Sheet 26 of 29) — flow diagram recovered from the drawing:
(420) SET ASIDE A NODE THAT IS RUNNING VIRTUAL MACHINE MANAGER
(422) CREATE A VIRTUAL MACHINE ON THE NODE
(424) INSTALL AN OPERATING SYSTEM ON THE VIRTUAL MACHINE
(426) INSTALL APPLICATIONS ON THE OPERATING SYSTEM
(428) PERFORM ANY NEEDED CONFIGURATION TO OS OR APPLICATIONS
(430) SNAPSHOT THE VIRTUAL DISK OF THE VIRTUAL MACHINE
(432) CAPTURE THE VIRTUAL DISK TO THE CONTROL NODE
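The capture procedure of FIG. 26 can be sketched as an orchestration over stub objects. Every class and method name below is hypothetical, standing in for whatever virtualization and image-capture interfaces a given deployment actually uses:

```python
class VirtualMachine:
    def install_operating_system(self):      # step 424
        print("installing operating system")
    def install_applications(self):          # step 426
        print("installing applications")
    def apply_configuration(self):           # step 428
        print("configuring OS and applications")
    def snapshot_virtual_disk(self):         # step 430
        return b"<virtual disk image bytes>"

class ImageHostNode:
    def create_virtual_machine(self):        # step 422
        return VirtualMachine()

class ControlNode:
    def __init__(self):
        self.image_repository = []
    def capture(self, disk_image):           # step 432
        self.image_repository.append(disk_image)

node, control_node = ImageHostNode(), ControlNode()
vm = node.create_virtual_machine()
vm.install_operating_system()
vm.install_applications()
vm.apply_configuration()
control_node.capture(vm.snapshot_virtual_disk())
```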
`
`
`
FIG. 27A (Sheet 27 of 29) — flow diagram recovered from the drawing:
(440) CREATE NEW TIER
(442) SPECIFY CAPTURED VIRTUAL MACHINE MANAGER IMAGE FOR NEW TIER
(444) ACTIVATE TIER

FIG. 27B (Sheet 27 of 29) — flow diagram recovered from the drawing:
(446) CREATE NEW TIER
(448) SPECIFY CAPTURED OS/APPLICATION IMAGE FOR NEW TIER
(450) SET TARGETS FOR TIER
(452) ACTIVATE TIER
`
`
`
FIG. 28 (Sheet 28 of 29) — screen image of an exemplary tier control interface for the embodiment of FIGS. 24 through 27; drawing not reproducible as text.
`
`
`
`
`
`DISTRIBUTED COMPUTING SYSTEM
`HAVING AUTONOMIC DEPLOYMENT OF
`VIRTUAL MACHINE DISK IMAGES
`
`This application claims the benefit of U.S. provisional
`Application Ser. No. 60/787,280, filed Mar. 30, 2006, the
`entire content of which is incorporated herein by reference.
`
`TECHNICAL FIELD
`
`The invention relates to computing environments and,
`more specifically, to distributed computing systems.
`
`BACKGROUND
`
Distributed computing systems are increasingly being utilized to support business as well as technical applications. Typically, distributed computing systems are constructed from a collection of computing nodes that combine to provide a set of processing services to implement the distributed computing applications. Each of the computing nodes in the distributed computing system is typically a separate, independent computing device interconnected with each of the other computing nodes via a communications medium, e.g., a network.

One challenge with distributed computing systems is the organization, deployment and administration of such a system within an enterprise environment. For example, it is often difficult to manage the allocation and deployment of enterprise computing functions within the distributed computing system. An enterprise, for example, often includes several business groups, and each group may have competing and variable computing requirements.
`
`SUMMARY
`
`15
`
`2
`deployment of a set of applications within a distributed com(cid:173)
`puting system, and performing operations to provide auto(cid:173)
`nomic control over the deployment of the applications within
`the distributed computing system in accordance with param(cid:173)
`eters of the application matrix.
`In another embodiment, a computer-readable medium
`comprises instructions. The instructions cause the processor
`to generate an application matrix that specifies data for con(cid:173)
`trolling the deployment of a set of applications within a dis-
`IO tributed computing system, and perform operations to pro(cid:173)
`vide autonomic control over the deployment of the
`applications within the distributed computing system in
`accordance with parameters of the application matrix.
`In another embodiment, a distributed computing system
`comprises a plurality of application nodes interconnected via
`a communications network. In addition, the distributed com(cid:173)
`puting system includes a software image repository storing:
`(i) one or more image instances of a virtual machine manager
`20 that is executable on the application nodes, wherein when
`executed on the applications nodes, the image instances of the
`virtual machine manager provide one or more virtual
`machines, and (ii) one or more image instances of one or more
`software applications that are executable on the virtual
`25 machines. The distributed computing system also includes a
`control node that comprises an automation infrastructure to
`provide autonomic deployment of the image instances of the
`virtual machine manager on the application nodes and to
`provide autonomic deployment of the image instances of the
`30 software applications on the virtual machines.
`The details of one or more embodiments of the invention
`are set forth in the accompanying drawings and the descrip(cid:173)
`tion below. Other features, objects, and advantages of the
`35 invention will be apparent from the description and drawings,
`and from the claims.
`
`In general, the invention is directed to a distributed com(cid:173)
`puting system that conforms to a multi-level, hierarchical
`organizational model. One or more control nodes provide for
`the efficient and automated allocation and management of 40
`computing functions and resources within the distributed
`computing system in accordance with the organization
`model.
`As described herein, the model includes four distinct lev(cid:173)
`els: fabric, domains, tiers and nodes that provide for the 45
`logical abstraction and containment of the physical compo(cid:173)
`nents as well as system and service application software of the
`enterprise. A user, such as a system administrator, interacts
`with the control nodes to logically define the hierarchical
`organization of the distributed computing system. The con- 50
`trol nodes are responsible for all levels of management in
`accordance with the model, including fabric management,
`domain creation, tier creation and node allocation and
`deployment.
`In one embodiment, a distributed computing system com- 55
`prises a plurality of application nodes interconnected via a
`communications network and a control node. The control
`node comprises a set of one or more applications to be
`executed on the application nodes, an application matrix that
`includes parameters for controlling the deployment of the 60
`applications within the distributed computing system, and an
`automation unit having one or more rule engines that provide
`autonomic control of the application nodes and the applica(cid:173)
`tions in accordance with a set of one or more rules and the
`application matrix.
`In another embodiment, a method comprises generating an
`application matrix that specifies data for controlling the
`
`BRIEF DESCRIPTION OF DRAWINGS
`
FIG. 1 is a block diagram illustrating a distributed computing system constructed from a collection of computing nodes.
FIG. 2 is a schematic diagram illustrating an example of a model of an enterprise that logically defines an enterprise fabric.
FIG. 3 is a flow diagram that provides a high-level overview of the operation of a control node when configuring the distributed computing system.
FIG. 4 is a flow diagram illustrating exemplary operation of the control node when assigning computing nodes to node slots of tiers.
FIG. 5 is a flow diagram illustrating exemplary operation of a control node when adding an additional computing node to a tier to meet additional processing demands.
FIG. 6 is a flow diagram illustrating exemplary operation of a control node harvesting excess node capacity from one of the tiers and returning the harvested computing node to the free pool.
FIG. 7 is a screen illustration of an exemplary user interface for defining tiers in a particular domain.
FIG. 8 is a screen illustration of an exemplary user interface for defining properties of the tiers.
FIG. 9 is a screen illustration of an exemplary user interface for viewing and identifying properties of a computing node.
FIG. 10 is a screen illustration of an exemplary user interface for viewing software images.
FIG. 11 is a screen illustration of an exemplary user interface for viewing a hardware inventory report.
`
`
`
`
`3
`FIG. 12 is a screen illustration of an exemplary user inter(cid:173)
`face for viewing discovered nodes that are located in the free
`pool.
`FIG. 13 is a screen illustration of an exemplary user inter(cid:173)
`face for viewing users of a distributed computing system.
`FIG. 14 is a screen illustration of an exemplary user inter(cid:173)
`face for viewing alerts for the distributed computing system.
`FIG. 15 is a block diagram illustrating one embodiment of
`control node that includes a monitoring subsystem, a service
`level automation infrastructure (SLAI), and a business logic 10
`tier (BLT).
`FIG. 16 is a block diagram illustrating one embodiment of
`the monitoring subsystem.
`FIG. 17 is a block diagram illustrating one embodiment of
`the SLAI in further detail.
`FIG. 18 is a block diagram of an example working memory
`associated with rule engines of the SLAI.
`FIG. 19 is a block diagram illustrating an example embodi(cid:173)
`ment for the BLT of the control node.
`FIG. 20 is a block diagram illustrating one embodiment of
`a rule engine in further detail.
`FIG. 21 is a block diagram illustrating another example
`embodiment of the control node.
`FIG. 22 is a flowchart illustrating an exemplary mode of
`initiating a control node utilizing an application matrix.
`FIG. 23 is a block diagram illustrating an exemplary appli(cid:173)
`cation matrix.
`FIG. 24 is a block diagram illustrating an exemplary con(cid:173)
`figuration of a physical node to which a virtual machine
`manager and one or more virtual machines are deployed.
`FIG. 25 is a flowchart illustrating an exemplary procedure
`for creating and capturing an image of a virtual machine
`manager for autonomic deployment within the distributed
`computing system.
`FIG. 26 is a flowchart illustrating an exemplary procedure 35
`for creating and capturing a virtual disk image file for auto(cid:173)
`nomic deployment within the distributed computing system.
`FIG. 27A is a flowchart illustrating an exemplary proce(cid:173)
`dure for deploying a virtual machine manager to a node
`within the distributed computing system.
`FIG. 27B is a flowchart illustrating an exemplary proce(cid:173)
`dure for deploying an operating system and application image
`to a node that is executing a virtual machine manager within
`the distributed computing system.
`FIG. 28 is a screen image illustrating an exemplary tier 45
`control interface for the embodiment described in FIGS. 24
`through 27.
`FIG. 29 is a screen image illustrating an exemplary node
`control interface for the embodiment described in FIGS. 24
`through 27.
`
`DETAILED DESCRIPTION
`
FIG. 1 is a block diagram illustrating a distributed computing system 10 constructed from a collection of computing nodes. Distributed computing system 10 may be viewed as a collection of computing nodes operating in cooperation with each other to provide distributed processing.

In the illustrated example, the collection of computing nodes forming distributed computing system 10 are logically grouped within a discovered pool 11, a free pool 13, allocated tiers 15 and a maintenance pool 17. In addition, distributed computing system 10 includes at least one control node 12.

Within distributed computing system 10, a computing node refers to the physical computing device. The number of computing nodes needed within distributed computing system 10 is dependent on the processing requirements. For example, distributed computing system 10 may include 8 to 512 computing nodes or more. Each computing node includes one or more programmable processors for executing software instructions stored on one or more computer-readable media.
Discovered pool 11 includes a set of discovered nodes that have been automatically "discovered" within distributed computing system 10 by control node 12. For example, control node 12 may monitor dynamic host configuration protocol (DHCP) leases to discover the connection of a node to network 18. Once detected, control node 12 automatically inventories the attributes for the discovered node and reassigns the discovered node to free pool 13. The node attributes identified during the inventory process may include a CPU count, a CPU speed, an amount of memory (e.g., RAM), local disk characteristics or other computing resources. Control node 12 may also receive input identifying node attributes not detectable via the automatic inventory, such as whether the node includes I/O, such as an HBA. Further details with respect to the automated discovery and inventory processes are described in U.S. patent application Ser. No. 11/070,851, entitled "AUTOMATED DISCOVERY AND INVENTORY OF NODES WITHIN AN AUTONOMIC DISTRIBUTED COMPUTING SYSTEM," filed Mar. 2, 2005, the entire content of which is hereby incorporated by reference.
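A minimal sketch of lease-based discovery follows, assuming a dnsmasq-style leases listing (expiry, MAC, IP, hostname); the control node's actual discovery mechanism is not specified at this level of detail.

```python
import re

# One dnsmasq-style lease per line: "<expiry> <MAC> <IP> <hostname> ..."
LEASE = re.compile(r"^\d+\s+(?P<mac>[0-9a-fA-F:]+)\s+(?P<ip>[\d.]+)")

def discover_nodes(leases_text: str, known_macs: set) -> list:
    """Return nodes seen in the DHCP leases that the control node does
    not already know about, and remember them for next time."""
    discovered = []
    for line in leases_text.splitlines():
        m = LEASE.match(line)
        if m and m["mac"] not in known_macs:
            known_macs.add(m["mac"])
            discovered.append({"mac": m["mac"], "ip": m["ip"]})
    return discovered

known = set()
print(discover_nodes("1700000000 aa:bb:cc:dd:ee:ff 10.0.0.21 node21", known))
```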
Free pool 13 includes a set of unallocated nodes that are available for use within distributed computing system 10. Control node 12 may dynamically reallocate an unallocated node from free pool 13 to allocated tiers 15 as an application node 14. For example, control node 12 may use unallocated nodes from free pool 13 to replace a failed application node 14 or to add an application node to allocated tiers 15 to increase processing capacity of distributed computing system 10.
In general, allocated tiers 15 include one or more tiers of application nodes 14 that are currently providing a computing environment for execution of user software applications. In addition, although not illustrated separately, application nodes 14 may include one or more input/output (I/O) nodes. Application nodes 14 typically have more substantial I/O capabilities than control node 12, and are typically configured with more computing resources (e.g., processors and memory). Maintenance pool 17 includes a set of nodes that either could not be inventoried or that failed and have been taken out of service from allocated tiers 15.
Control node 12 provides the system support functions for managing distributed computing system 10. More specifically, control node 12 manages the roles of each computing node within distributed computing system 10 and the execution of software applications within the distributed computing system. In general, distributed computing system 10 includes at least one control node 12, but may utilize additional control nodes to assist with the management functions.

Other control nodes 12 (not shown in FIG. 1) are optional and may be associated with a different subset of the computing nodes within distributed computing system 10. Moreover, control node 12 may be replicated to provide primary and backup administration functions, thereby allowing for graceful handling of a failover in the event control node 12 fails.
Network 18 provides a communications interconnect for control node 12 and application nodes 14, as well as discovered nodes, unallocated nodes and failed nodes. Communications network 18 permits internode communications among the computing nodes as the nodes perform interrelated operations and functions. Communications network 18 may comprise, for example, direct connections between one or more of the computing nodes, one or more customer networks
maintained by an enterprise, local area networks (LANs), wide area networks (WANs) or a combination thereof. Communications network 18 may include a number of switches, routers, firewalls, load balancers, and the like.

In one embodiment, each of the computing nodes within distributed computing system 10 executes a common general-purpose operating system. One example of a general-purpose operating system is the Windows™ operating system provided by Microsoft Corporation. In some embodiments, a general-purpose operating system such as the Linux kernel may be used.
In the example of FIG. 1, control node 12 is responsible for software image management. The term "software image" refers to a complete set of software loaded on an individual computing node to provide an execution environment for one or more applications. The software image typically includes the operating system and all boot code and middleware files, and may include application files. As described below, embodiments of the invention provide application-level autonomic control over the deployment, execution and monitoring of applications onto software images associated with application nodes 14.
System administrator 20 may interact with control node 12 and identify the particular types of software images to be associated with application nodes 14. Alternatively, administration software executing on control node 12 may automatically identify the appropriate software images to be deployed to application nodes 14 based on the input received from system administrator 20. For example, control node 12 may determine the type of software image to load onto an application node 14 based on the functions assigned to the node by system administrator 20. Application nodes 14 may be divided into a number of groups based on their assigned functionality. As one example, application nodes 14 may be divided into a first group to provide web server functions, a second group to provide business application functions and a third group to provide database functions. The application nodes 14 of each group may be associated with different software images.
Control node 12 provides for the efficient allocation and management of the various software images within distributed computing system 10. In some embodiments, control node 12 generates a "golden image" for each type of software image that may be deployed on one or more of application nodes 14. As described herein, the term "golden image" refers to a reference copy of a complete software stack for providing an execution environment for applications.

System administrator 20 may create a golden image by installing an operating system, middleware and software applications on a computing node and then making a complete copy of the installed software. In this manner, a golden image may be viewed as a "master copy" of the software image for a particular computing function. Control node 12 maintains a software image repository 26 that stores the golden images associated with distributed computing system 10.
Control node 12 may create a copy of a golden image, referred to as an "image instance," for each possible image instance that may be deployed within distributed computing system 10 for a similar computing function. In other words, control node 12 pre-generates a set of K image instances for a golden image, where K represents the maximum number of image instances for which distributed computing system 10 is configured for the particular type of computing function. For a given computing function, control node 12 may create the complete set of image instances even if not all of the image instances will be initially deployed. Control node 12 creates different sets of image instances for different computing functions, and each set may have a different number of image instances depending on the maximum number of image instances that may be deployed for each set. Control node 12 stores the image instances within software image repository 26. Each image instance represents a collection of bits that may be deployed on an application node.
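Pre-generating the K instances can be sketched as follows, with a plain file copy standing in for whatever image format and repository layout the system actually uses; the function and path names are the editor's assumptions.

```python
import shutil
from pathlib import Path

def pregenerate_instances(golden: Path, repository: Path, k: int) -> list:
    """Create K image instances from a golden image, whether or not all
    of them will be deployed immediately."""
    repository.mkdir(parents=True, exist_ok=True)
    instances = []
    for i in range(k):
        instance = repository / f"{golden.stem}_instance{i}{golden.suffix}"
        shutil.copyfile(golden, instance)   # copy of the master image
        instances.append(instance)
    return instances
```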
Further details of software image management are described in co-pending U.S. patent application Ser. No. 11/046,133, entitled "MANAGEMENT OF SOFTWARE IMAGES FOR COMPUTING NODES OF A DISTRIBUTED COMPUTING SYSTEM," filed Jan. 28, 2005 and co-pending U.S. patent application Ser. No. 11/046,152, entitled "UPDATING SOFTWARE IMAGES ASSOCIATED WITH A DISTRIBUTED COMPUTING SYSTEM," filed Jan. 28, 2005, each of which is incorporated herein by reference in its entirety.
`In general, distributed computing system 10 conforms to a
`multi-level, hierarchical organizational model that includes
`four distinct levels: fabric, domains, tiers and nodes. Control
`node 12 is responsible for all levels of management, including
`fabric management, domain creation, tier creation and node
`allocation and deployment.
As used herein, the "fabric" level generally refers to the logical constructs that allow for definition, deployment, partitioning and management of distinct enterprise applications. In other words, fabric refers to the integrated set of hardware, system software and application software that can be "knitted" together to form a complete enterprise system. In general, the fabric level consists of two elements: fabric components or fabric payload. Control node 12 provides fabric management and fabric services as described herein.

In contrast, a "domain" is a logical abstraction for containment and management within the fabric. The domain provides a logical unit of fabric allocation that enables the fabric to be partitioned amongst multiple uses, e.g. different business services.
Domains are comprised of tiers, such as a 4-tier application model (web server, application server, business logic, persistence layer) or a single-tier monolithic application. Fabric domains contain the free pool of devices available for assignment to tiers.
A tier is a logically associated group of fabric components within a domain that share a set of attributes: usage, availability model or business service mission. Tiers are used to define structure within a domain, e.g., an N-tier application, and each tier represents a different computing function. A user, such as administrator 20, typically defines the tier structure within a domain. The hierarchical architecture may provide a high degree of flexibility in mapping customer applications to logical models which run within the fabric environment. The tier is one construct in this modeling process and is the logical container of application resources.
The lowest level, the node level, includes the physical components of the fabric. This includes computing nodes that, as described above, provide operating environments for system applications and enterprise software applications. In addition, the node level may include network devices (e.g., Ethernet switches, load balancers and firewalls) used in creating the infrastructure of network 18. The node level may further include network storage nodes that are network connected to the fabric.
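The four-level model just described can be illustrated as nested data types. A sketch, assuming simple Python dataclasses; the tier names follow the 4-tier example given above, and the field choices are the editor's assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class NodeSlot:
    node: str | None = None        # physical node assigned, if any

@dataclass
class Tier:
    name: str                      # e.g. "web server"
    slots: list = field(default_factory=list)

@dataclass
class Domain:
    name: str                      # e.g. one business service
    tiers: list = field(default_factory=list)

@dataclass
class Fabric:
    domains: list = field(default_factory=list)
    free_pool: list = field(default_factory=list)

# The 4-tier example from the text, inside a single domain:
domain = Domain("business-service", [Tier(n) for n in (
    "web server", "application server", "business logic", "persistence")])
fabric = Fabric(domains=[domain])
```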
System administrator 20 accesses administration software executing on control node 12 to logically define the hierarchical organization of distributed computing system 10. For example, system administrator 20 may provide organizational data 21 to develop a model for the enterprise and
logically define the enterprise fabric. System administrator 20 may, for instance, develop a model for the enterprise that includes a number of domains, tiers, and node slots hierarchically arranged within a single enterprise fabric.

More specifically, system administrator