US 20120317568A1
`
(19) United States
(12) Patent Application Publication
Aasheim

(10) Pub. No.: US 2012/0317568 A1
(43) Pub. Date: Dec. 13, 2012
`
`(54) OPERATING SYSTEM DECOUPLED
`HETEROGENEOUS COMPUTING
`
(75) Inventor: Jered Aasheim, Bellevue, WA (US)
`
(73) Assignee: MICROSOFT CORPORATION, Redmond, WA (US)
`
(21) Appl. No.: 13/155,387
`
(22) Filed: Jun. 8, 2011
`
`Publication Classification
`
(51) Int. Cl.
G06F 9/455 (2006.01)
`(52) U.S. Cl. ............................................................ 718/1
`
(57) ABSTRACT
`
A heterogeneous processing system is described herein that provides a software hypervisor to autonomously control operating system thread scheduling across big and little cores without the operating system's awareness or involvement to improve energy efficiency or meet other processing goals. The system presents a finite set of virtualized compute cores to the operating system, to which the system schedules threads for execution. Subsequently, the hypervisor intelligently controls the physical assignment and selection of which core(s) execute each thread to manage energy use or other processing requirements. By using a software hypervisor to abstract the underlying big and little computer architecture, the performance and power operating differences between the cores remain opaque to the operating system. The inherent indirection also decouples the release of hardware with new capabilities from the operating system release schedule.
`
`Petitioner Samsung Ex-1023, 0001
`
`
`
`
[FIG. 1 (block diagram): Heterogeneous Processing System 100, comprising Central Processing Unit(s) 110, Operating System Interface Component 120, Virtual Core Mgmt. Component 130, Policy Engine Component 140, Policy Data Store 150, Scheduling Component 160, Capability Mgmt. Component 170, and Hardware Interface Component 180.]
`
`
`
`
`
[FIG. 2 (flow diagram), Initialize Virtual Cores: Receive Startup Request (210); Enumerate Physical Cores (220); Determine Core Capabilities (230); Identify Operating System (240); Access Hypervisor Policy (250); Create Virtual Cores (260); Invoke Operating System (270); Done.]
`
`
`
`
`
[FIG. 3 (flow diagram), Schedule Thread: Receive Thread Scheduling Request (310); Determine Thread Processing Needs (320); Access Scheduling Policy (330); Select Physical Processing Core (340); Manage Core Capability Differences (350); Schedule Thread to Execute on Selected Core (360); Done.]
`
`
`
`
`
[FIG. 4 (block diagram), operating environment of the heterogeneous processing system: core clusters with L2 Cache (420) and L2 Cache and SCU (430), joined by a Coherent Interconnect.]
`
`
`
`
`
`OPERATING SYSTEM DECOUPLED
`HETEROGENEOUS COMPUTING
`
`BACKGROUND
`
[0001] Energy efficiency is increasingly becoming an important differentiator, from mobile phones to datacenters. Customers are willing to pay a premium for longer-lasting mobile device experiences but are also anxious to get increasing performance from these same devices. On the other end of the scale, datacenters continue to scale up compute power but face thermal limits on what can be efficiently cooled. In addition, the public is increasingly conscious of energy usage and the environmental impact of energy use. Making efficient use of energy is therefore a higher-priority design goal in many types of computing systems.
[0002] These technically opposing agendas, delivering more performance while using less power, have resulted in the industry experimenting with heterogeneous designs of "big" compute cores closely coupled with "little" compute cores within a single system or silicon chip, called heterogeneous cores or heterogeneous processing herein. The big cores are designed to offer high performance in a larger power envelope, while the little cores are designed to offer lower performance in a smaller power envelope. The conventional wisdom is that an operating system's scheduler will then selectively schedule threads on the big or little cores depending upon the workload(s). During at least some times of the day, the operating system may be able to turn off the big core(s) entirely and rely on the power-sipping little cores.
`[0003] Big and little cores may or may not share the same
`instruction set or features. For example, little cores may
`include a reduced instruction set or other differences that
`involve further decision making by the operating system to
`schedule processes on a compatible core. One traditional
`example is a system that includes a central processing unit
`(CPU) and graphics-processing unit (GPU) and allows the
`GPU to be used for computing tasks when it is idle or
`underutilized.
[0004] Existing solutions depend on modifying the operating system's kernel in order to "enlighten" the operating system to the presence of big and little cores, their respective performance and power characteristics, and which facilities in the system (e.g., CPU performance counters, cache miss/hit counters, bus activity counters, and so on) the operating system can monitor to determine on which core(s) to schedule a particular thread. This approach has several drawbacks: 1) it involves modifying the kernel for all supported operating systems, 2) it requires the modified kernel to understand differences in big/little designs across potentially different architectures (e.g., supporting N different implementations), and 3) it tightly couples the release schedule of the operating system kernel to the underlying computer architecture. Changes to the computer architecture then involve waiting for the next scheduled operating system release (i.e., potentially several years or more) before the kernel can support new cores commercially (or vice versa).
`
`SUMMARY
`
[0005] A heterogeneous processing system is described herein that provides a software hypervisor to autonomously control operating system thread scheduling across big and little cores without the operating system's awareness or involvement to improve energy efficiency or meet other processing goals. The system presents a finite set of virtualized compute cores to the operating system, to which the system schedules threads for execution. Subsequently, underneath the surface, the hypervisor intelligently controls the physical assignment and selection of which core(s), big or little, execute each thread to manage energy use or other processing requirements. By using a software hypervisor to abstract the underlying big and little computer architecture, the performance and power operating differences between the cores remain opaque to the operating system. The inherent indirection also decouples the release of hardware with new capabilities from the operating system release schedule. A hardware vendor can release an updated hypervisor and allow new hardware to work with any operating system version the vendor chooses.
`[0006] This Summary is provided to introduce a selection
`of concepts in a simplified form that are further described
`below in the Detailed Description. This Summary is not
`intended to identify key features or essential features of the
`claimed subject matter, nor is it intended to be used to limit
`the scope of the claimed subject matter.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
[0007] FIG. 1 is a block diagram that illustrates components of the heterogeneous processing system, in one embodiment.
[0008] FIG. 2 is a flow diagram that illustrates processing of the heterogeneous processing system to initialize a computing device with heterogeneous processing cores using a hypervisor between the cores and an operating system, in one embodiment.
[0009] FIG. 3 is a flow diagram that illustrates processing of the heterogeneous processing system to schedule one or more operating system threads through a hypervisor that manages heterogeneous processing cores, in one embodiment.
[0010] FIG. 4 is a block diagram that illustrates an operating environment of the heterogeneous processing system, in one embodiment.
`
`DETAILED DESCRIPTION
`
[0011] A heterogeneous processing system is described herein that provides a software hypervisor to autonomously control operating system thread scheduling across big and little cores without the operating system's awareness or involvement to improve energy efficiency or meet other processing goals. The system presents a finite set of virtualized compute cores to the operating system, to which the system schedules threads for execution. Subsequently, underneath the surface, the hypervisor intelligently controls the physical assignment and selection of which core(s), big or little, execute each thread to manage energy use or other processing requirements. By using a software hypervisor to abstract the underlying big and little computer architecture, the performance and power operating differences between the cores remain opaque to the operating system. The inherent indirection also decouples the release of hardware with new capabilities from the operating system release schedule. A hardware vendor can release an updated hypervisor and allow new hardware to work with any operating system version the vendor chooses.
[0012] The hypervisor implementation is tightly coupled to the underlying computer architecture and uses the available system feedback (e.g., CPU utilization, bus/cache activity, and so forth) to autonomously assign the appropriate cores for the requested workloads. This approach allows the underlying computer architecture to change frequently in cooperation with the software hypervisor and decouples this evolution from the operating system(s) above. The heterogeneous processing system provides simple, coarse-grained power management without modifying the operating system kernel itself. Thus, the heterogeneous processing system allows for more rapid hardware innovation, and allows existing datacenter and other installations to benefit today from available heterogeneous processing hardware.
[0013] Heterogeneous computing is an emerging field within the industry with the goal of optimizing the execution of workloads based on the different types of computing cores (e.g., CPUs, GPUs, accelerators, and so on) available in the system. Optimization can be for performance, power, latency, or other goals. The heterogeneous processing system, while applicable to these more general cases, is also targetable at systems with cores that have identical functional equivalence but differing performance/power operating characteristics. Typically, these systems have one or more big cores and one or more little cores. The big cores typically have deep pipelines, out-of-order execution, large caches, and high clock speeds, and are manufactured using higher-leakage processes (e.g., 40G). The little cores typically have shorter pipelines, smaller caches, lower clock speeds, and various power levels, and are manufactured using low-leakage processes (e.g., 40LP).
[0014] In some embodiments, the big and little cores may have architecture equivalence, micro-architecture equivalence, a global interrupt controller, coherency, and virtualization. Architecture equivalence may include the same Instruction Set Architecture (ISA), Single Instruction Multiple Data (SIMD), Floating Point (FP), co-processor availability, and ISA extensions. Micro-architecture equivalence may include differences in performance but the same configurable features (e.g., cache line length). A global interrupt controller provides the ability to manage, handle, and forward interrupts to all cores. Coherency means all cores can access (cached) data from other cores, with forwarding as needed. Virtualization is for switching/migrating workloads from/to cores.
[0015] In some embodiments, the heterogeneous processing system may be able to handle minor differences in cores. For example, a little core that does not support Streaming Single Instruction, Multiple Data (SIMD) Extensions (SSE) (now existing in four iterations, SSE1, SSE2, SSE3, and SSE4) may still handle other Intel x86-based software code. The hypervisor may detect unsupported instructions in the instruction stream and wake up an appropriate core to which to assign such streams. Other instruction streams may operate faithfully on any core. In some cases, such as where only a handful of unsupported instructions are used, the hypervisor may include some level of emulation to emulate the unsupported instructions on the available instruction set. For example, operations such as vector math can often be broken down and implemented at lower efficiency using standard math instructions.
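The routing described above can be sketched as follows. This is an illustrative reconstruction, not code from the patent: the core descriptors, extension names, and the `pick_compatible_core` helper are all assumptions for the example.

```python
# Hypothetical sketch: a thread whose instruction stream needs ISA
# extensions (e.g., an SSE variant) missing from a little core is steered
# to a core that supports them, falling back to emulation if none does.

def pick_compatible_core(required_exts, cores):
    """Return the id of the first core whose extensions cover the thread's
    needs, preferring little (low-power) cores for energy efficiency."""
    # Sort so little cores (is_big == False) are considered first.
    for core in sorted(cores, key=lambda c: c["is_big"]):
        if required_exts <= core["extensions"]:
            return core["id"]
    return None  # no compatible core: the hypervisor would emulate

cores = [
    {"id": 0, "is_big": False, "extensions": {"sse1", "sse2"}},
    {"id": 1, "is_big": True,  "extensions": {"sse1", "sse2", "sse3", "sse4"}},
]

print(pick_compatible_core({"sse2"}, cores))    # the little core suffices
print(pick_compatible_core({"sse4"}, cores))    # only the big core qualifies
print(pick_compatible_core({"avx512"}, cores))  # nothing fits: emulate
```

The sort-by-`is_big` ordering encodes the power bias: a stream is only woken onto a big core when no little core can execute it natively.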
[0016] The software hypervisor installs itself during the device boot process prior to operating system (OS) initialization. After completing specified hardware configuration (i.e., configuring memory, initializing the virtualization facilities, and so on), the hypervisor then configures the big and little processing cores installed in the computing device via policy. For example, if the device is a mobile phone, the policy could dictate that the hypervisor start the operating system with a minimal amount of performance available and optimize for battery life; the hypervisor would subsequently schedule operating system threads to one or more little cores. Alternatively, if the device is a datacenter blade, the policy could dictate that the hypervisor start the operating system with the maximal amount of available performance and sacrifice energy efficiency; the hypervisor would subsequently schedule operating system threads to the available big cores, as well as possibly the little cores depending on the available thermal budget. After completing initialization, the software hypervisor loads the operating system boot manager, which then loads the operating system.
[0017] During runtime, the heterogeneous processing system presents a virtualized set of cores to the operating system. The operating characteristics and differences between the cores are opaque to the operating system and managed privately by the software hypervisor based upon the defined operating policy. The operating policy may be set during system initialization or dynamically during runtime.
[0018] The hypervisor uses the operating policy in conjunction with available system facilities (e.g., CPU performance counters, cache miss/hit counters, bus activity counters, and so on) to determine to which cores to schedule the operating system threads. The hypervisor will use this information to understand CPU core utilization, trends over time, locality of information, and input/output (I/O) patterns. From this information, the hypervisor can dynamically and speculatively migrate the operating system threads across the big and little cores as appropriate. Additionally, the hypervisor may also control dynamic voltage and frequency scaling (DVFS) on behalf of the operating system, depending on the system implementation.
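The feedback loop described in this paragraph can be illustrated with a hedged sketch; the counter source, the moving-average trend, and the threshold values are assumptions for illustration, not details from the patent.

```python
# Illustrative sketch: sampled per-thread utilization (derived from CPU
# performance counters) feeds a crude trend estimate, and the thread is
# migrated between little and big cores when the trend crosses thresholds.

def migration_decision(samples, on_big, up=0.75, down=0.25):
    """Given recent utilization samples (0.0-1.0) for a thread, decide
    whether to move it to a big core, a little core, or leave it alone."""
    trend = sum(samples) / len(samples)  # simple moving average
    if not on_big and trend > up:
        return "migrate_to_big"      # sustained heavy load: boost
    if on_big and trend < down:
        return "migrate_to_little"   # near idle: save power
    return "stay"

print(migration_decision([0.9, 0.8, 0.95], on_big=False))  # heavy load
print(migration_decision([0.1, 0.05, 0.2], on_big=True))   # near idle
```

A real hypervisor would also weigh cache locality and I/O patterns, as the paragraph notes; the single averaged counter here is the minimal version of that idea.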
[0019] Here is a sampling of the available operating policies the hypervisor may control: Minimum Power (MiPo), Maximum Performance (MaPe), Minimal Power, Performance on Demand (MiPoD), and Maximum Performance, Power Down on Idle (MaPeI). Each of these is described in the following paragraphs. However, additional, more advanced operating policies can be implemented as chosen by any particular implementation.
`[0020] Minimum Power (MiPo) schedules threads to the
`minimal set of cores. This typically will mean the hypervisor
`schedules threads to the little cores and uses DVFS as needed
`to control the power and performance operating point for the
`core. Additional little cores can be powered and scheduled as
`needed.
[0021] Maximum Performance (MaPe) schedules threads to the maximal set of cores. This typically will mean the hypervisor schedules threads to all available cores, starting with the big cores, and uses DVFS as needed to control the power and performance operating point for the cores. The little cores are also powered and scheduled as much as is allowed by the available thermal budget.
[0022] Minimal Power, Performance on Demand (MiPoD) normally operates at the lowest available power state (e.g., on one or more little cores) but boosts performance as workloads demand. This is commonly referred to as a "turbo" or "boost" mode of operation and is enabled by dynamically allocating and scheduling to big cores. Once the workload is completed, the system returns to the minimal power state (e.g., on a little core).
[0023] Maximum Performance, Power Down on Idle (MaPeI) normally operates at the maximal available performance state (e.g., on one or more big cores) but acquiesces to lower power states once an idle threshold is reached. The idle threshold in this case is not the typical near-zero CPU utilization but can be arbitrarily defined at some Dhrystone Million Instructions per Second (DMIPS) or CPU utilization percentage as defined by the policy. When going to idle, the hypervisor dynamically allocates and schedules to little cores and puts the unused big cores into standby/parked states. Policy and/or future workloads determine when the system returns to the maximum available performance state (e.g., on big cores).
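The four policies above can be condensed into a minimal core-selection sketch. The function name, the scalar "demand" input, and the threshold values are illustrative assumptions, not from the patent; only the policy names and their big/little biases come from the text.

```python
# Sketch: map each named policy and a workload demand level (0.0-1.0)
# to the class of core the hypervisor would schedule onto.

LITTLE, BIG = "little", "big"

def select_core(policy, demand, idle_threshold=0.3, boost_threshold=0.8):
    """Pick a core class for a workload under the given operating policy."""
    if policy == "MiPo":     # Minimum Power: always the little cores
        return LITTLE
    if policy == "MaPe":     # Maximum Performance: always the big cores
        return BIG
    if policy == "MiPoD":    # little by default, "boost" to big on demand
        return BIG if demand > boost_threshold else LITTLE
    if policy == "MaPeI":    # big by default, park to little when idle
        return LITTLE if demand < idle_threshold else BIG
    raise ValueError(f"unknown policy: {policy}")

print(select_core("MiPoD", 0.9))  # demand exceeds the boost threshold
print(select_core("MaPeI", 0.1))  # demand below the idle threshold
```

Note how MiPoD and MaPeI are mirror images: the former escalates from a low-power default, the latter de-escalates from a high-performance default.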
[0024] FIG. 1 is a block diagram that illustrates components of the heterogeneous processing system, in one embodiment. The system 100 includes one or more central processing units 110, an operating system interface component 120, a virtual core management component 130, a policy engine component 140, a policy data store 150, a scheduling component 160, a capability management component 170, and a hardware interface component 180. Each of these components is described in further detail herein. The following components may be implemented within a software hypervisor that sits between an operating system and the hardware resources of a computing device.
[0025] The one or more central processing units 110 include one or more processing cores that have heterogeneous processing capabilities and power profiles. Typically, each CPU complex is located on a single silicon die and each core of a CPU complex shares a silicon die. Hardware can be implemented in a variety of packages for a variety of types of devices. For example, newer mobile devices and even some recent desktop processors include a CPU and GPU on the same chip for efficient communication between the two and lower power usage. Each CPU complex may include one or more big and little cores. Alternatively or additionally, one CPU complex may include all big cores while another CPU complex includes all little cores. The term CPU complex as used here also applies to GPUs and other hardware that can execute software instructions.
[0026] The operating system interface component 120 communicates between a hypervisor and an operating system to receive instructions for delivering to hardware resources and for receiving output from the hardware resources. The operating system may schedule threads, provide a pointer to an instruction stream (e.g., a program counter (PC)), write to memory areas that pass instructions to hardware, and so forth. An operating system typically interacts directly with the hardware on a computing device. However, a hypervisor inserts a layer of indirection between the operating system and hardware for a variety of purposes. Often, hypervisors are used to provide virtualization so that multiple operating systems can be run contemporaneously on the same hardware. A hypervisor can also be used to present virtual hardware to the operating system that differs from the actual hardware installed in a computing device. In the case of the heterogeneous processing system 100, this can include making big and little cores appear the same to the operating system. The system 100 may even present a different number of cores to the operating system than actually exist in the device.
[0027] The virtual core management component 130 manages one or more virtual cores that the hypervisor presents to the operating system. A virtual core appears to the operating system as a CPU core but may differ in characteristics from the available physical hardware in a computing device. For example, the virtual cores may hide differences in processing or power capabilities from the operating system, so that an operating system not designed to work with heterogeneous big and little cores can operate in the manner for which the operating system was designed. In such cases, the hypervisor provides any specialized programming needed to leverage the heterogeneous computing environment, so that the operating system need not be modified.
[0028] The policy engine component 140 manages one or more policies for scheduling operating system threads and presenting virtual cores to the operating system based on the available one or more central processing units. The policy engine component 140 may include hardcoded policies specific to a particular hypervisor implementation or may include administrator-configurable policies that can be modified to suit particular installation goals. Policies may determine which cores are scheduled first, tradeoffs between power usage and processing goals, how cores are shut off and awoken to save power, how virtual cores are presented to the operating system, and so forth.
[0029] The policy data store 150 stores the one or more policies in a storage facility accessible to the hypervisor at boot and execution times. The policy data store 150 may include one or more files, file systems, hard drives, databases, or other storage facilities for persisting data across execution sessions of the system 100. In some embodiments, the administrator performs a setup step that takes the system 100 through a configuration phase to store an initial set of policies for use by the hypervisor.
[0030] The scheduling component 160 schedules one or more instruction streams, received as threads from the operating system, to one or more of the central processing units installed in the computing device. The scheduling component receives a virtual core identification from the operating system that identifies the virtual core to which the operating system requests to schedule the thread. The scheduling component 160 examines the schedule request and determines a physical core on which to schedule the thread to execute. For example, the component 160 may determine whether power or processing is more relevant for the thread, and schedule to an appropriate little or big core in response. In some cases, the component 160 may avoid scheduling threads to certain cores to allow those cores to be powered down to save power.
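The scheduling step just described, taking a virtual core identification from the operating system and privately choosing a physical core behind it, might be sketched as follows. The class shape, the boolean performance hint, and the modulo placement rule are assumptions for illustration, not the patent's design.

```python
# Sketch: the OS schedules against a virtual core id; the hypervisor's
# scheduling component maps that request onto a physical core, preferring
# big or little cores according to the thread's processing needs.

class Scheduler:
    def __init__(self, physical_cores):
        # physical_cores: list of (core_id, is_big) tuples
        self.cores = physical_cores

    def schedule(self, virtual_core_id, needs_performance):
        """Return the physical core id chosen for this request."""
        preferred = [cid for cid, is_big in self.cores
                     if is_big == needs_performance]
        # Fall back to any core if the preferred class is unavailable.
        pool = preferred or [cid for cid, _ in self.cores]
        # Spread virtual cores across the chosen pool deterministically.
        return pool[virtual_core_id % len(pool)]

sched = Scheduler([(0, False), (1, False), (2, True), (3, True)])
print(sched.schedule(0, needs_performance=True))   # lands on a big core
print(sched.schedule(1, needs_performance=False))  # lands on a little core
```

The key property the paragraph describes is preserved: the operating system only ever names virtual cores, and the virtual-to-physical mapping stays entirely inside the hypervisor.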
[0031] The capability management component 170 optionally manages one or more differences between big and little processing cores. In some cases, the system 100 may only operate on processing units in which the big and little cores share the same capabilities, and the capability management component 170 is not needed. In other cases, the system 100 handles minor or major differences between the available processing cores. For example, the system 100 may watch for instructions that are not supported by some cores and schedule the corresponding threads on cores that do support those instructions. In more sophisticated implementations, the component 170 may virtualize or emulate big core capabilities on little cores (or vice versa) to satisfy a power or other profile goal.
[0032] The hardware interface component 180 communicates between the hypervisor and the central processing units to schedule software instructions to run on the available physical cores. The hardware interface component 180 may include real memory addresses or other facilities for accessing real hardware that are hidden from other components, and in particular from the guest operating system(s) managed by the hypervisor.
`
`
`
`
`
[0033] The computing device on which the heterogeneous processing system is implemented may include a central processing unit, memory, input devices (e.g., keyboard and pointing devices), output devices (e.g., display devices), and storage devices (e.g., disk drives or other non-volatile storage media). The memory and storage devices are computer-readable storage media that may be encoded with computer-executable instructions (e.g., software) that implement or enable the system. In addition, the data structures and message structures may be stored or transmitted via a data transmission medium, such as a signal on a communication link. Various communication links may be used, such as the Internet, a local area network, a wide area network, a point-to-point dial-up connection, a cell phone network, and so on.
[0034] Embodiments of the system may be implemented in various operating environments that include personal computers, server computers, handheld or laptop devices, multiprocessor systems, microprocessor-based systems, programmable consumer electronics, digital cameras, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, set top boxes, systems on a chip (SOCs), and so on. The computer systems may be cell phones, personal digital assistants, smart phones, personal computers, programmable consumer electronics, digital cameras, and so on.
[0035] The system may be described in the general context of computer-executable instructions, such as program modules, executed by one or more computers or other devices. Generally, program modules include routines, programs, objects, components, data structures, and so on that perform particular tasks or implement particular abstract data types. Typically, the functionality of the program modules may be combined or distributed as desired in various embodiments.
[0036] FIG. 2 is a flow diagram that illustrates processing of the heterogeneous processing system to initialize a computing device with heterogeneous processing cores using a hypervisor between the cores and an operating system, in one embodiment.
[0037] Beginning in block 210, the system receives a startup request to initialize a computing device. For example, a basic input/output system (BIOS), extensible firmware interface (EFI), boot loader, or other initial device software may load and invoke a hypervisor that implements the heterogeneous computing system. In some cases, the administrator will have previously performed an installation phase to install the hypervisor on the computing device, although the system can also support network boot and other non-installation scenarios commonly offered for computing devices.
[0038] Continuing in block 220, the system enumerates two or more physical processing cores of the computing device. In some embodiments, at least two cores offer different performance and power usage characteristics. However, the system may also be used where asymmetry is not present. For example, using a software hypervisor for power management could still be applicable in scenarios where N physical CPUs are on die but only K can be operated, based upon externalities such as ambient temperature, form factor enclosure, cost of available power, and so on. At boot, the hypervisor can use this "policy" information to report a virtualized set of K cores to the operating system, and this could vary upon each boot cycle. The hypervisor would be performing the same task in this scenario for symmetric cores. The system may invoke the BIOS or other underlying layer to determine how many and what kind of processors the computing device has installed, and may execute a CPUID or other similar instruction to determine information about the processing capabilities of the processors. In some embodiments, the system may include an extensibility interface through which drivers or other hypervisor extensions can be implemented and added by the hypervisor manufacturer or a third party to add support for new processing hardware, without necessarily updating the hypervisor itself.
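The enumerate-then-report step in this block can be sketched as below. The idea of exposing only K of N physical cores according to policy comes from the text; the data shapes and the little-cores-first ordering are assumptions for the example.

```python
# Sketch: enumerate N physical cores, then report only the K cores the
# boot-time policy allows as the virtualized set shown to the OS.

def virtualized_core_set(physical_cores, policy_limit):
    """Return the ids of the cores to expose, little cores first so a
    power-biased policy surfaces low-power cores before big ones."""
    # Python's sort is stable, so ties keep their enumeration order.
    ordered = sorted(physical_cores, key=lambda c: c["is_big"])
    return [c["id"] for c in ordered[:policy_limit]]

# Two little cores (ids 0-1) and two big cores (ids 2-3) on die.
physical = [{"id": i, "is_big": i >= 2} for i in range(4)]
print(virtualized_core_set(physical, policy_limit=3))
```

Because the limit is applied at boot from policy, the reported set can differ between boot cycles even on fully symmetric hardware, matching the scenario the paragraph describes.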
[0039] Continuing in block 230, the system determines the capabilities of each enumerated processing core. The capabilities may include one or more power profiles offered by each core, one or more instruction sets supported by each core, performance characteristics of each core, and so forth. The system may leverage informational interfaces (such as the previously mentioned CPUID instruction) of the core itself, or information provided by a driver or other extension to the hypervisor, to determine each core's capabilities. The system uses the determined capabilities to assign threads to each core that are compatible with the core, and to perform scheduling in a manner consistent with received policies and processing goals.
[0040] Continuing in block 240, the system identifies one or more operating systems for which the hypervisor will manage access and scheduling for the enumerated physical cores. The system may access a hard drive, flash drive, or other storage of the computing device to determine which operating system to invoke after the hypervisor is initialized. The hypervisor may be designed with information about various operating systems, and may include extensibility so that new operating systems can be supported without updating the hypervisor itself. Each operating system and operating system version may have different scheduling semantics or other nuances that the hypervisor handles to allow the operating system to execute correctly on virtualized processing resources. In some cases, the hypervisor may be requested to allow multiple operating systems to share the enumerated physical processing cores, and policy may dictate how that sharing is handled.
[0041] Continuing in block 250, the system accesses hypervisor policy information that specifies one or more goals for scheduling operating system threads on the enumerated physical processing cores. The goals may include performance goals, power usage goals, or other directions for determining which core or cores on which to execute operating system threads. The policy may be stored in a storage device associated with the computing device, hardcoded into a hypervisor implementation, and so forth. The hypervisor may receive updates to the policy through an administrative interface provided to administrators.
[0042] Continuing in block 260, the system creates one or more virtual cores to expose to the identified operating system, wherein each virtual core isolates the operating system from determined differences in capabilities among the physical processing cores. For example, the heterogeneo