`
(12) United States Patent
Thomson et al.

(10) Patent No.: US 9,086,883 B2
(45) Date of Patent: Jul. 21, 2015

(54) SYSTEM AND APPARATUS FOR CONSOLIDATED DYNAMIC FREQUENCY/VOLTAGE CONTROL

(75) Inventors: Steven S. Thomson, San Diego, CA (US); Mriganka Mondal, San Diego, CA (US); Nishant Hariharan, San Diego, CA (US)
(73) Assignee: QUALCOMM Incorporated, San Diego, CA (US)
(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 660 days.
(21) Appl. No.: 13/344,146
(22) Filed: Jan. 5, 2012

(65) Prior Publication Data
US 2013/0007413 A1, Jan. 3, 2013

Primary Examiner: Ji H Bae
(74) Attorney, Agent, or Firm: Nicholas A. Cole

Related U.S. Application Data
(60) Provisional application No. 61/495,861, filed on Jun. 10, 2011.

(51) Int. Cl.
G06F 1/32 (2006.01)
G06F 1/00 (2006.01)
G06F 1/26 (2006.01)
G06F 15/16 (2006.01)
H04L 29/08 (2006.01)
G06F 9/38 (2006.01)
(52) U.S. Cl.
CPC: G06F 1/3296 (2013.01); G06F 1/324 [remainder of entry illegible]; H04L 67/16 (2013.01); Y02B 60/1217 (2013.01); Y02B 60/1285 (2013.01)
(58) Field of Classification Search
None
See application file for complete search history.

(57) ABSTRACT
Methods and apparatus for accomplishing dynamic frequency/voltage control between at least two processor cores in a multi-processor device or system include receiving busy, idle and wait, time and/or frequency information from a first processor core and receiving busy, idle, wait, time and/or frequency information from a second processor core. The received busy, idle, wait, time and/or frequency information may be correlated to identify patterns of interdependence. The correlated information may be used to determine dynamic frequency/voltage control settings for the first and second processor cores to provide a performance level that accommodates interdependent processes, threads and processor cores. The correlation of received busy, idle, wait, time and/or frequency information may involve generating a consolidated busy/idle pulse train that can then be used to set the frequency or voltage of each processor core independently.

40 Claims, 11 Drawing Sheets
`
[Front-page drawing, system 400 (drawing omitted). Legible labels: Software (User Space) with a Consolidated DCVS Control Module; Software (Kernel) with idle stats devices, a deferred timer driver, CPU and GPU drivers, and CPU/GPU Freq Hotplug; Hardware with CPU 0, CPU 1, 2D GPU 0, 2D GPU 1, 3D GPU 0, and Clocks, PMIC, SPMs. Legible reference numerals: 400, 402, 404, 406, 412, 414, 416, 420.]

`Patent Owner Daedalus Prime LLC
`Exhibit 2005 - Page 1 of 26
`
`
`
`
`
`
FIG. 1 (Sheet 1 of 11). [Drawing omitted. Legible labels: Modem Processor, Graphics Processor, Applications Processor, Coprocessor, Digital Signal Processor, Analog and Custom Circuitry, System Components and Resources, Voltage Regulator, Interconnection/Bus. Legible reference numerals: 104, 106, 108, 110.]
`
`
`
FIG. 2 (Sheet 2 of 11). [Drawing omitted. Legible labels: Processing Units, L1 Caches, L2 Caches, Bus/Interconnects, Main Memory, Input/Output, External Memory, Hard Disk.]
`
`
`
FIG. 3 (Sheet 3 of 11). [Drawing omitted. Legible labels: SMEMI, Fixed Function.]
`
`
`
FIG. 4 (Sheet 4 of 11). [Drawing omitted. System 400 with Software (User Space) 406 containing the Consolidated DCVS Control Module; a kernel software layer 404 with Idle Stats blocks, 2D GPU 0, 2D GPU 1, and 3D GPU 0 drivers, and CPU/GPU Freq. Hotplug; and a hardware layer 402 with CPU 0, CPU 1, 2D GPU 0, 2D GPU 1, 3D GPU 0, and Clocks, PMIC, SPMs.]
`
`
`
FIG. 5 (Sheet 5 of 11). [Drawing omitted. Flow diagram of method 500. Legible steps, in the order extracted: (1) receive busy/idle/wait information from a first processing core; (2) receive busy/idle/wait information from a second processing core (504); (3) deliver data to the consolidated DCVS algorithm; (4) correlate the idle/busy/wait periods and I/O periods across the cores; (5) determine performance objectives for the system as a whole and determine frequency/voltage settings for power conservation; (6) process data (514).]
`
`
`
FIG. 6 (Sheet 6 of 11). [Drawing omitted. Legible labels: User Space DCVS Driver (605); Poll on FDs (635); Read stat data from all of the devices (640); Update the Optimum Performance Level (645). Other legible reference numerals: 615, 620, 625.]
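The polling step named in FIG. 6 (poll on file descriptors, then read stat data from the devices) might look roughly like the following. This is a minimal sketch under stated assumptions, not the patented driver: the function name `read_ready_stats`, the timeout, the byte limit, and the idea that each device exposes its statistics through a readable descriptor are all hypothetical, and the sketch assumes a Unix-like system.

```python
import os
import selectors


def read_ready_stats(fds, timeout=1.0, max_bytes=4096):
    """Wait for readable descriptors, then read raw stat data from each one.

    Returns a dict mapping each ready descriptor to the bytes read from it.
    Descriptors that are not readable before the timeout are left out.
    """
    stats = {}
    with selectors.DefaultSelector() as selector:
        for fd in fds:
            selector.register(fd, selectors.EVENT_READ)
        # select() blocks until a descriptor is readable or the timeout expires.
        for key, _events in selector.select(timeout):
            stats[key.fd] = os.read(key.fd, max_bytes)
    return stats
```

A caller would pass the returned statistics on to whatever logic updates the performance level; that step is not sketched here.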
`
`
`
FIG. 7A (Sheet 7 of 11). [Drawing omitted. Pulse trains plotted against a common Time axis, with rows labeled CPU0 Busy, CPU1 Busy, Consolidated CPU0 Busy, Consolidated CPU1 Busy, and Consolidated GPU Busy. Legible annotations include 50% Busy, 33% Busy, 17% Busy, and 100% Busy. Legend: Busy, Idle, Wait, Discarded.]
`
`
`
FIG. 7B (Sheet 8 of 11). [Drawing omitted. Pulse trains plotted against a common Time axis, with rows labeled CPU0 Busy, CPU1 Busy, Consolidated CPU0 Busy, and Consolidated CPU1 Busy, plus a further Consolidated row whose label is truncated. Legible annotations: 17%, 33%, 17%, 17%, 40%, and 25% Busy. Legend: Busy, Idle, Wait, Discarded.]
`
`
`
FIG. 8 (Sheet 9 of 11). [Drawing omitted. Flow diagram of method 800. Legible steps: receive indication of need to evaluate frequency/voltage settings in a first processor core (805); receive a measure of busy, idle and wait periods on the processor core (810); receive a measure of busy, idle and wait periods for a second processor core for the same intervals (815); correlate the busy, idle and wait periods of two or more processor cores (820); determine an appropriate frequency/voltage setting based on the correlated busy/idle/wait periods of the two or more processor cores (825); implement the determined frequency/voltage setting in each of the two processor cores (830).]
`
`
`
Sheet 10 of 11. [Drawings omitted. Only the reference numerals 900, 1011, 1021, and 1031 survive extraction.]
`
`
`
FIG. 11 (Sheet 11 of 11). [Drawing omitted; no labels survive extraction.]
`
`
`
SYSTEM AND APPARATUS FOR CONSOLIDATED DYNAMIC FREQUENCY/VOLTAGE CONTROL

RELATED APPLICATIONS

This application claims the benefit of priority to U.S. Provisional Application No. 61/495,861, entitled “System and Apparatus for Consolidated Dynamic Frequency/Voltage Control” filed Jun. 10, 2011, the entire contents of which are hereby incorporated by reference.

BACKGROUND

Cellular and wireless communication technologies have seen explosive growth over the past several years. This growth has been fueled by better communications, hardware, larger networks, and more reliable protocols. Wireless service providers are now able to offer their customers an ever-expanding array of features and services, and provide users with unprecedented levels of access to information, resources, and communications. To keep pace with these service enhancements, mobile electronic devices (e.g., cellular phones, tablets, laptops, etc.) have become more powerful and complex than ever. For example, mobile electronic devices now commonly include system-on-chips (SoCs) and/or multiple microprocessor cores embedded on a single substrate, allowing mobile device users to execute complex and power intensive software applications on their mobile devices. As a result, a mobile device's battery life and power consumption characteristics are becoming ever more important considerations for consumers of mobile devices.
Methods for improving the battery life of multiprocessor devices generally involve reducing the amount of energy consumed by reducing the voltage applied to the processors/cores when they are idle or lightly loaded. Reducing the voltage applied to processors/cores necessarily involves reducing the frequency at which the processors operate. Such reductions in frequency and voltage may be accomplished by scaling the voltage/frequency using dynamic clock and voltage/frequency scaling (DCVS) schemes/processes.
Generally, DCVS schemes/processes monitor the proportion of the time that the processor core is idle compared to the time it is busy to determine how the frequency and voltage should be adjusted to provide power-efficient operation. For example, the busy and idle periods may be reviewed, and a decision may be made regarding the most energy efficient performance of the processor, in real time or “on the fly.” However, existing DCVS solutions for multicore processors require that each processing core include a DCVS module/process and/or adjust the processor's frequency/voltage independent of other cores. Conventional DCVS solutions exhibit a number of performance problems, and implementing an effective DCVS method that correctly scales frequency/voltage for each core of a multicore processor system is an important and challenging design criterion.

SUMMARY

The various aspects include methods for correlating dynamic frequency and/or voltage control between at least two processor cores that determine a frequency performance level for the two or more processor cores which accommodates processes involving interactions between the processor cores. The various aspects evaluate the performance of each processor core to determine if there exists a correlation between the operations of two or more cores, and scale the frequency/voltage of an individual core only when there is no identifiable correlation between the processor operations. Various aspects correlate the workloads (e.g., busy versus idle states) of two or more processor cores, and may scale the frequency/voltage of the cores to a level consistent with the correlated processes such that the processing performance is maintained and maximum energy efficiency is achieved. In various aspects, the method may further include receiving an input/output activity signal from one of the first and the second processor cores, and using the received input/output activity signal in determining the consolidated dynamic frequency/voltage control for the first and the second processor cores.
The various aspects include methods of performing dynamic clock and/or voltage scaling on a multiprocessor system having two or more processor cores, which may include receiving a first set of information from a first processor core, the first information set including information regarding at least one of a frequency, time, busy periods, idle periods, and wait periods of the first processor core, receiving a second set of information from a second processor core, the second information set including information regarding at least one of a frequency, time, busy periods, idle periods, and wait periods of the second processor core, correlating the first and second information sets to identify an interdependence relationship between the operations of the first processor core and the operations of the second processor core, and scaling the frequency and/or the voltage of the first and second cores according to a correlated information set when an interdependence relationship is identified between the operations of the first processor core and the operations of the second processor core. In an aspect, the method may further include scaling the frequency or voltage of the first and second cores independently when no interdependence relationship is identified between the operations of the first processor core and the operations of the second processor core, or any number of the processor cores. In an aspect, the method may further include synchronizing the first and second information sets, as well as any number of received information sets. In a further aspect, operations of correlating information sets to identify a relationship between the operations of the first processor core and the operations of the second processor core may include identifying a relationship in which the first processor core is busy when the second processor core is idle. In this aspect, the method may further include subtracting a busy time value associated with the first core from an idle time value associated with the second core. In a further aspect, correlating the synchronized first and second information sets to identify a relationship between the operations of the first processor core and the operations of the second processor core may include identifying a relationship in which the first processor core is busy when the second processor core is idle. In this aspect, the method may further include subtracting a busy time value associated with the second core from an idle time value associated with the first core. In a further aspect, correlating the synchronized first and second information sets to identify a relationship between the operations of the first processor core and the operations of the second processor core may include overlaying a first busy pulse train on a second busy pulse train. In a further aspect, the first and second information sets may include pulse trains selected from one of a busy pulse train, an idle pulse train, and a wait pulse train, and synchronizing the first and second information sets may include synchronizing a first pulse train with a second pulse train. In a further aspect, a single thread executing on the multiprocessor system may perform the dynamic clock and voltage scaling operations. In a further
aspect, correlating the synchronized first and second information sets to identify a relationship between the operations of the first processor core and the operations of the second processor core may include producing a consolidated pulse train for each of the first and the second processing cores. In a further aspect, correlating the synchronized first and second information sets to identify a relationship between the operations of the first processor core and the operations of the second processor core may further include using the consolidated pulse train for each of the first and the second processing cores to determine a performance level of each of the first and second processing cores independently. In further aspects, the operations described above may be accomplished for any number of processor cores which may be in a computing device, including receiving any number of information sets and correlating some or all of the information sets to identify relationships among the cores.
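As a rough illustration of the correlation step summarized above, the sketch below looks for hand-offs between two synchronized busy pulse trains and picks consolidated or independent scaling accordingly. The representation (one 0/1 sample per time slot), the hand-off heuristic, the `min_handoffs` threshold, and both function names are assumptions made for this example; the text does not prescribe a particular correlation test.

```python
def count_handoffs(busy_a, busy_b):
    """Count samples where one core stops being busy just as the other starts.

    Both arguments are synchronized pulse trains: equal-length sequences of
    0 (idle) and 1 (busy) sampled on a common time base.
    """
    if len(busy_a) != len(busy_b):
        raise ValueError("pulse trains must cover the same samples")
    handoffs = 0
    for t in range(1, len(busy_a)):
        a_stopped = busy_a[t - 1] and not busy_a[t]
        a_started = not busy_a[t - 1] and busy_a[t]
        b_stopped = busy_b[t - 1] and not busy_b[t]
        b_started = not busy_b[t - 1] and busy_b[t]
        if (a_stopped and b_started) or (b_stopped and a_started):
            handoffs += 1
    return handoffs


def choose_scaling_mode(busy_a, busy_b, min_handoffs=2):
    """Return 'consolidated' when the cores appear interdependent.

    The threshold is a hypothetical tuning knob, not a value from the source.
    """
    if count_handoffs(busy_a, busy_b) >= min_handoffs:
        return "consolidated"
    return "independent"
```

With two cores that pass work back and forth, for example `[1, 1, 0, 0, 1, 1, 0, 0]` and `[0, 0, 1, 1, 0, 0, 1, 1]`, every transition is a hand-off and the sketch selects consolidated scaling; two cores with unrelated bursts produce no hand-offs and are scaled independently.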
Further aspects include a computing device having memory and two or more processor cores coupled to the memory, wherein at least one of the processor cores is configured with processor-executable instructions to cause the computing device to perform operations of the aspect methods for performing dynamic clock and/or voltage scaling on a multiprocessor system. In an aspect, the at least one of the processor cores may be configured with processor-executable instructions to cause the computing device to perform operations that may include receiving a first set of information from a first processor core, the first information set including information regarding at least one of a frequency, time, busy periods, idle periods, and wait periods of the first processor core, receiving a second set of information from a second processor core, the second information set including information regarding at least one of a frequency, time, busy periods, idle periods, and wait periods of the second processor core, correlating the first and second information sets to identify an interdependence relationship between the operations of the first processor core and the operations of the second processor core, and scaling the frequency or voltage of the first and second cores according to a correlated information set when an interdependence relationship is identified between the operations of the first processor core and the operations of the second processor core. In an aspect, at least one of the processor cores may be configured with processor-executable instructions to cause the computing device to perform operations that include scaling the frequency or voltage of the first and second cores independently when no interdependence relationship is identified between the operations of the first processor core and the operations of the second processor core. In an aspect, at least one of the processor cores may be configured with processor-executable instructions to cause the computing device to perform operations that include synchronizing the first and second information sets.
In a further aspect, at least one of the processor cores may be configured with processor-executable instructions to cause the computing device to perform operations such that correlating the synchronized first and second information sets to identify a relationship between the operations of the first processor core and the operations of the second processor core includes identifying a relationship in which the first processor core is busy when the second processor core is idle. In this aspect, the at least one of the processor cores may be configured with processor-executable instructions to cause the computing device to perform operations that include subtracting a busy time value associated with the first core from an idle time value associated with the second core.
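The subtraction described above can be sketched numerically. The source states only that a busy time value of one core is subtracted from an idle time value of the other; treating the remainder as the idle time seen by the scaling logic, clamping it at zero, and the function name `consolidated_busy_fraction` are assumptions made for this illustration.

```python
def consolidated_busy_fraction(own_busy_ms, own_idle_ms, other_busy_ms):
    """Busy fraction of a core after discounting idle time spent waiting.

    own_busy_ms / own_idle_ms describe the core being evaluated over one
    measurement window; other_busy_ms is the busy time of the core it
    depends on over the same window.
    """
    window_ms = own_busy_ms + own_idle_ms
    if window_ms <= 0:
        raise ValueError("measurement window must be positive")
    # Idle time that overlaps the other core's busy time is not counted as idle.
    effective_idle_ms = max(own_idle_ms - other_busy_ms, 0.0)
    return (window_ms - effective_idle_ms) / window_ms
```

For instance, a core that is busy for 30 ms and idle for 70 ms looks 30% busy on its own. If the other core was busy for 50 ms of that window, the sketch counts only 20 ms as idle and reports a consolidated busy fraction of about 0.8.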
In a further aspect, at least one of the processor cores may be configured with processor-executable instructions to cause the computing device to perform operations such that correlating the synchronized first and second information sets to identify a relationship between the operations of the first processor core and the operations of the second processor core includes identifying a relationship in which the first processor core is busy when the second processor core is idle. In this aspect, at least one of the processor cores may be configured with processor-executable instructions to cause the computing device to perform operations that include subtracting a busy time value associated with the second core from an idle time value associated with the first core.
In a further aspect, at least one of the processor cores may be configured with processor-executable instructions to cause the computing device to perform operations such that correlating the synchronized first and second information sets to identify a relationship between the operations of the first processor core and the operations of the second processor core includes overlaying a first busy pulse train on a second busy pulse train. In a further aspect, at least one of the processor cores may be configured with processor-executable instructions to cause the computing device to perform operations such that the first and second information sets include pulse trains selected from one of a busy pulse train, an idle pulse train, and a wait pulse train, and synchronizing the first and second information sets includes synchronizing a first pulse train with a second pulse train. In a further aspect, at least one of the processor cores may be configured with processor-executable instructions to cause the computing device to perform operations such that a single thread executing on the multiprocessor system performs the dynamic clock and voltage scaling operations.
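One simple way to picture overlaying one busy pulse train on another is a sample-by-sample OR of two synchronized trains. This is only a sketch: the 0/1 sample representation and the function names `overlay` and `busy_fraction` are assumptions, and the source does not state that the overlay is computed this way.

```python
def overlay(train_a, train_b):
    """Overlay two synchronized busy pulse trains (lists of 0/1 samples).

    A sample in the result is busy (1) if either input is busy in that slot.
    """
    if len(train_a) != len(train_b):
        raise ValueError("pulse trains must be synchronized to the same length")
    return [a | b for a, b in zip(train_a, train_b)]


def busy_fraction(train):
    """Fraction of samples in which the pulse train is busy."""
    if not train:
        raise ValueError("pulse train must not be empty")
    return sum(train) / len(train)
```

Overlaying `[1, 0, 1, 0, 0, 0]` on `[0, 1, 0, 1, 0, 0]` gives `[1, 1, 1, 1, 0, 0]`: each core alone is busy a third of the time, while the overlaid train is busy two thirds of the time.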
In a further aspect, at least one of the processor cores may be configured with processor-executable instructions to cause the computing device to perform operations such that correlating the synchronized first and second information sets to identify a relationship between the operations of the first processor core and the operations of the second processor core includes producing a consolidated pulse train for each of the first and the second processing cores. In a further aspect, at least one of the processor cores may be configured with processor-executable instructions to cause the computing device to perform operations such that correlating the synchronized first and second information sets to identify a relationship between the operations of the first processor core and the operations of the second processor core further includes using the consolidated pulse train for each of the first and the second processing cores to determine a performance level of each of the first and second processing cores independently.
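To make the last step concrete, the sketch below turns a consolidated pulse train into a frequency choice for one core. The frequency table, the 80% target utilization, and the function name `pick_frequency_khz` are hypothetical; the source says only that the consolidated pulse train may be used to determine each core's performance level independently.

```python
def pick_frequency_khz(consolidated_train, current_khz, available_khz,
                       target_util=0.8):
    """Pick the lowest available frequency that keeps the core under target_util.

    consolidated_train is a list of 0/1 samples (1 = busy) measured while the
    core ran at current_khz. available_khz lists the supported frequencies.
    """
    if not consolidated_train:
        raise ValueError("pulse train must not be empty")
    busy = sum(consolidated_train) / len(consolidated_train)
    # Frequency at which the same work would occupy target_util of the time.
    needed_khz = busy * current_khz / target_util
    for freq in sorted(available_khz):
        if freq >= needed_khz:
            return freq
    return max(available_khz)
```

Each core is evaluated with its own consolidated train, so two cores on the same device can end up at different levels. With a table of 300, 600, 800, and 1000 MHz and a core currently at 1000 MHz, a train that is busy half the time maps to 800 MHz, while one that is busy a fifth of the time maps to 300 MHz.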
Further aspects include a non-transitory processor-readable storage medium having stored thereon processor-executable instructions configured to cause at least one processor core of a multi-processor system to perform operations of the aspect methods for performing dynamic clock and/or voltage scaling. Further aspects include a computing device having various means for performing functions of the aspect methods for performing dynamic clock and/or voltage scaling on a multiprocessor system.
`
`BRIEF DESCRIPTION OF THE DRAWINGS
`
The accompanying drawings, which are incorporated herein and constitute part of this specification, illustrate exemplary aspects of the invention, and together with the general description given above and the detailed description given below, serve to explain the features of the invention.
FIG. 1 is an architectural diagram of an example system on chip suitable for implementing the various aspects.
FIG. 2 is an architectural diagram of an example multicore processor suitable for implementing the various aspects.
FIG. 3 is a block diagram of a controller having multiple cores suitable for use in an aspect.
FIG. 4 is a process flow diagram of an aspect method for correlating idle and busy periods across processing cores to determine performance objectives for a system.
FIG. 5 is a communication flow diagram illustrating communications and processes among a driver and a number of processing cores for using pulse trains to set performance levels for each processor core according to an aspect.
FIG. 6 illustrates processor pulse trains showing alternating busy and idle periods for processor cores along a common time reference.
FIGS. 7A-B illustrate processor pulse trains of busy, idle, and wait periods along a common time reference.
FIG. 8 is a process flow diagram of an aspect method implementable on any of a plurality of processor cores for determining appropriate frequency/voltage settings for two or more processor cores based on the correlated busy and idle periods of two or more processor cores.
FIG. 9 is a component block diagram of a mobile device suitable for use in an aspect.
FIG. 10 is a component block diagram of a server device suitable for use in an aspect.
FIG. 11 is a component block diagram of a laptop computer device suitable for use in an aspect.
`
runs). This separation is of particular importance in Android and other general public license (GPL) environments where code that is part of the kernel space must be GPL licensed, while code running in user-space doesn't need to be GPL licensed.
The term “system on chip” (SOC) is used herein to refer to a single integrated circuit (IC) chip that contains multiple resources and/or processors integrated on a single substrate. A single SOC may contain circuitry for digital, analog, mixed-signal, and radio-frequency functions. A single SOC may also include any number of general purpose and/or specialized processors (DSP, modem processors, video processors, etc.), memory blocks (e.g., ROM, RAM, Flash, etc.), and resources (e.g., timers, voltage regulators, oscillators, etc.). SOCs may also include software for controlling the integrated resources and processors, as well as for controlling peripheral devices.
The term “multicore processor” is used herein to refer to a single integrated circuit (IC) chip or chip package that contains two or more independent processing cores (e.g., CPU cores) configured to read and execute program instructions. A SOC may include multiple multicore processors, and each processor in an SOC may be referred to as a core.
The term “resource” is used herein to refer to any of a wide variety of circuits (e.g., ports, clocks, buses, oscillators, etc.), components (e.g., memory), signals (e.g., clock signals), and voltages (e.g., voltage rails) which are used to support processors and clients running on a computing device.
Generally, the dynamic power (switching power) dissipated by a chip is C*V²*f, where C is the capacitance being switched per clock cycle, V is voltage, and f is the switching frequency. Thus, as frequency changes, the dynamic power will change linearly with it. Dynamic power may account for approximately two-thirds of the total chip power. Dynamic voltage scaling may be accomplished in conjunction with frequency scaling, as the frequency that a chip runs at may be related to the operating voltage. The efficiency of some electrical components, such as voltage regulators, may decrease with increasing temperature such that the power used increases with temperature. Since increasing power use may increase the temperature, increases in voltage or frequency may increase system power demands even further.
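A short worked example may help show why voltage and frequency are scaled together. It uses the standard dynamic power relation P = C·V²·f; the capacitance, voltage, and frequency values below are illustrative assumptions, not figures from the source.

```python
def dynamic_power_watts(capacitance_farads, voltage_volts, frequency_hz):
    """Dynamic (switching) power: P = C * V^2 * f."""
    return capacitance_farads * voltage_volts ** 2 * frequency_hz


# Hypothetical operating points for a core with 1 nF of switched capacitance.
full_speed = dynamic_power_watts(1e-9, 1.0, 1.0e9)   # 1.0 V at 1 GHz
half_freq = dynamic_power_watts(1e-9, 1.0, 0.5e9)    # frequency halved only
scaled_both = dynamic_power_watts(1e-9, 0.8, 0.5e9)  # frequency and voltage lowered
```

Halving the frequency alone halves the dynamic power (about 1 W down to about 0.5 W in this example). Lowering the voltage to 0.8 V at the same time brings it to roughly 0.32 W, because power falls with the square of the voltage.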
Dynamic scaling of voltage and frequency has previously been accomplished by voltage scaling/frequency scaling dynamic clock and voltage scaling (DCVS) mechanisms implemented within each processing core. Generally, each processing core DCVS functions to adjust its frequency/voltage independent of other processor cores within the multiprocessor and/or computing device. However, this can present performance issues when two or more processor cores are processing threads alternately. This may occur when a single thread is processed by a first processor core, then by a second processor core, and then again by the first processor core. This may also occur when multiple threads are processing on respective processor cores and the results of one thread in one processor core trigger operations of another thread in a second processor core. In these situations, each processor core may alternately enter idle states while it awaits the results of processing in the other processor core. If each processor core DCVS considers only the busy and idle conditions of its own core, this interdependency of two or more processor cores will not be considered by conventional DCVS methods. As a result, one or more of the processor cores may shift to a lower frequency/voltage state to conserve power because the processor core is idle a significant portion of the time. The slower a processor operates (i.e., the lower its operating frequency), the more energy efficient it becomes,
`
DETAILED DESCRIPTION

The various aspects will be described in detail with reference to the accompanying drawings. Wherever possible, the same reference numbers will be used throughout the drawings to refer to the same or like parts. References made to particular examples and implementations are for illustrative purposes, and are not intended to limit the scope of the invention or the claims.
The word “exemplary” is used herein to mean “serving as an example, instance, or illustration.” Any implementation described herein as “exemplary” is not necessarily to be construed as preferred or advantageous over other implementations.
The terms “mobile device” and “computing device” are used interchangeably herein to refer to any one or all of personal mobile television receivers, cellular telephones, personal data assistants (PDAs), multimedia Internet enabled cellular telephones (e.g., the Blackberry®, Google® Android® compatible phones, Apple® iPhones®, etc.), tablet computers, palm-top computers, laptop computers, netbooks, and similar personal electronic devices which include a programmable processor and operate under battery power such that power conservation methods are of benefit.
Computer program code or “program code” for execution on a programmable processor for carrying out operations of the various aspects may be written in a high level programming language such as C, C++, C#, JAVA, Smalltalk, JavaScript, J++, Visual Basic, TSQL, Perl, or in various other programming languages. Programs for some target processor architecture may also be written directly in the native assembler language. A native assembler program uses instruction mnemonic representations of machine level binary instructions. Program code or programs stored on a computer readable storage medium as used herein refers to machine language code such as object code whose format is
understand