`
`International Journal of Computer Applications (0975 – 8887)
`Volume 90 – No 4, March 2014
`
`Mahendra Pratap Singh
`Research Scholar,
`Department of Computer Science, Mohanlal
`Sukhadia University, Udaipur, India
`
`
`ABSTRACT
The mobile phone has become a vital component of daily life.
Technological advancements have brought significant changes to
the processor architecture of mobile phones, transforming the
typical mobile phones of the 1990s into modern smartphones. The
design and deployment of mobile processors over the years have
been largely shaped by communication, performance, and
low-power requirements. The transition from analog to digital
telephony has resulted in mobile devices delivering a wide
range of data services. To support these services, processor
architecture has become much more complex, and mobile
processors grow more capable with each passing generation. The
goal of this paper is to review various processor architectures
for mobile phones.
`
`Keywords
`Processor architecture, DSP, VLIW, SoC, ARM processors
`
`1. INTRODUCTION
Smartphone hardware mainly consists of an application processor
(System-on-a-Chip), RAM (mobile SDRAM/mobile DDR), a DSP, a CPU
(ARM processor), etc. According to Koushanfar et al. [1],
processors used in mobile phones are subject to design metrics
that emphasize cost, time-to-market, and low power. Because of
constrained power and cost budgets and real-time computation
requirements, processors for mobile applications possess a
number of distinct characteristics, such as limited
programmability.
`
The processor architecture of mobile devices delivering data
services must support a much more complex user interface,
dynamic operating environments, and additional services. To
meet these additional requirements, advanced architectures may
include multiple DSPs or hardware coprocessors.
`
2. MOBILE PROCESSOR ARCHITECTURE TRENDS
2.1 Traditional DSP (Digital Signal Processor) Architectures
`
`Manoj Kumar Jain, Ph. D
`Associate Professor, Computer Science
`Mohanlal Sukhadia University
`Udaipur, India
`
There have been two distinct approaches to the implementation
of cellular handsets. One approach emphasizes programmable
DSPs, while the other utilizes ASIC (application-specific
integrated circuit) techniques [6].
`
A DSP is a specialized microprocessor optimized for signal
processing, which makes it well suited to mobile phones.
Historically, DSPs were designed as single stand-alone
integrated circuits (ICs). Embedded DSPs are now widely adopted
in VLSI designs. Programmable DSPs are dominant in the wireless
handset market for digital cellular telephony.
`
The first generation of mobile communications, i.e. 1G systems,
used analog transmission, which required more transmission
power and supported only a limited number of users [2].
`
The Global System for Mobile Communications (GSM) standard
evolved after the first generation of mobile communication,
which relied on analog cellular networks. DSP processors form
one of the most important classes of mobile embedded processors
in second-generation, i.e. 2G, systems.
`
Because product life cycles are short, programmable DSP
architectures are preferred over ASICs, and they are
extensively used in GSM mobiles. Programmable DSPs provide a
cost-effective and flexible architecture for cellular
telephones. AT&T introduced one of the first DSPs in 1979, and
Texas Instruments subsequently came up with other DSPs.
`
Traditionally, DSPs used the Harvard architecture, which
physically separates the storage and signal pathways for
instructions and data [2]. This is in contrast with the von
Neumann architecture, where data and instructions are stored
in the same memory. As shown in Figure 1, the Harvard
architecture uses a separate data memory and instruction
memory, each with its own bus, so an instruction and its data
can be transferred simultaneously. The output of the
multiplication unit connects to an adder, which adds and saves
all partial results for further processing.
`
`This architecture results in fewer cycles for executing a
`particular function as it enables high memory bandwidth and
`multiple operand operations. Multiply-accumulate (MAC)
`instructions are commonly associated with DSP architectures.
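
The multiply-accumulate pattern can be illustrated with a finite impulse response (FIR) filter, a common DSP kernel. The following is a minimal Python sketch and is not taken from the cited sources; the function name fir_filter and the sample values are illustrative assumptions. Each pass through the inner loop performs one MAC: a multiplication whose product is added to a running sum, mirroring the multiplier-feeds-adder datapath described above.

```python
def fir_filter(samples, coeffs):
    """Apply an FIR filter using explicit multiply-accumulate steps.

    Hypothetical helper for illustration only. A DSP would typically
    execute each MAC as a single instruction, fetching the sample and
    the coefficient over separate buses.
    """
    outputs = []
    for n in range(len(samples)):
        acc = 0  # accumulator register
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * samples[n - k]  # one multiply-accumulate (MAC)
        outputs.append(acc)
    return outputs


# Example: a 3-tap filter with unit taps on a short integer signal
print(fir_filter([1, 2, 3, 4], [1, 1, 1]))  # prints [1, 3, 6, 9]
```

In this sketch a filter with N taps needs N MAC steps per output sample, which is why a single-cycle MAC and simultaneous operand fetches reduce the cycle count so directly.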
`
`
`
`
`Momentum Dynamics Corporation
`Exhibit 1027
`Page 001
`
`
`
`
[Figure 1 placeholder: block diagram with DATA MEMORY, INSTRUCTION MEMORY, ALU, and CONTROL & ADDR blocks; IN, OUT, INSTRUCTION, CONTROL, STATUS, and CLOCK signals]

Figure 1: Traditional DSP Architecture (Harvard Architecture)
`
2.2 Modern DSP Architectures
Apart from traditional architectures, some modern DSP
architectures have evolved for mobile devices. The TMS320C55 is
a modern DSP that implements the Harvard architecture, using
one read bus for code and three read buses for data [2]. The
TMS320C55 architecture incorporates features such as
programmable idle modes and automatic power saving for better
processor utilization at top speeds.

The TMS320C62xx is an example of a VLIW (very long instruction
word) DSP processor. The VLIW architecture in DSPs provides a
compiler-based, programmer-friendly environment. These VLIW
processors follow explicitly parallel instruction computing
(EPIC) architectures.

The TigerSHARC DSP architecture has a series of advanced
features, such as the use of “short vectors” to process
information using a SIMD (single instruction, multiple data)
architecture [2].

DSPs have become common in mobile devices because they provide
real-time operation at low power cost. Future mobile devices
will have to integrate more functions while taking
computational power requirements into account. Advancements in
DSPs lead to higher clock frequencies as well as reduced power
consumption per MIPS for mobile phones [3].

One way of customization is parallelism in superscalar designs,
which is widely implemented for future mobile processors. VLIW
and SIMD architectures are becoming more and more popular in
modern cellular devices because they allow the frequency and
voltage of the CPU chips to be reduced without losing
performance. Modern DSPs can be more effective if they are able
to support parallel processing.

In modern DSP architectures, computational power has greatly
improved due to advancements in chip fabrication. As shown in
Table 1 below, a DSP chip provided approximately 5 GIPS (giga
instructions per second) in 2000, in contrast to 5 MIPS in
1980, and this grew to 50 GIPS in 2010. Considering the other
factors as well, the advancements in DSP integration are
significant.

Table 1: DSP Integration over the Years
Parameter                  1980     1990      2000        2010
Die Size (mm)              50       50        50          5
Technology (micrometers)   3        0.8       0.1         0.02
MIPS                       5        40        5000        50000
MHz                        20       80        1000        10000
RAM (bytes)                256      2000      32000       1000000
Price (dollars)            150      15        5           0.15
Power (mW/MIPS)            250      12.5      0.1         0.001
Transistors                50000    500000    5 million   50 million
Wafer Size (inches)        3        6         12          12
Source: Past and Projected Evolution of DSP [Gene 2000]
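
As a rough worked example based on the Table 1 figures, multiplying the power-per-MIPS row by the MIPS row gives an estimate of the power drawn at full throughput. The Python sketch below uses only the MIPS and Power (mW/MIPS) values from Table 1; the dictionary layout and the function name are illustrative assumptions, and the product is an interpretation of the table rather than a figure reported by its source.

```python
# Values copied from Table 1: year -> (MIPS, power in mW per MIPS)
TABLE_1 = {
    1980: (5, 250),
    1990: (40, 12.5),
    2000: (5000, 0.1),
    2010: (50000, 0.001),
}


def power_at_full_throughput_mw(year):
    """Return MIPS x (mW/MIPS) for the given Table 1 column."""
    mips, mw_per_mips = TABLE_1[year]
    return mips * mw_per_mips


for year in sorted(TABLE_1):
    print(year, round(power_at_full_throughput_mw(year), 3), "mW")
```

Under this reading, the estimated full-throughput power falls from about 1250 mW in 1980 to about 50 mW in 2010, while throughput rises by a factor of 10,000.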
`
A number of FPGA (field-programmable gate array) architectures
have been specifically designed for DSP to utilize VLSI
resources effectively. VLIW compilers are extensively used in
DSP architectures to support instruction-level parallelism. The
trend in modern DSP architectures is to develop fault-tolerant
and reliable DSP systems. DSP systems are reconfigured with a
number of design goals in mind, such as performance, power,
reliability, cost, and development time. Modern DSP systems are
implemented on reconfigurable DSP architectures.
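
To illustrate how a VLIW compiler can expose instruction-level parallelism, the following Python sketch greedily packs independent operations into fixed-width bundles. It is a simplified illustration, not the algorithm of any particular compiler: the function name pack_bundles, the operation names, and the assumptions that every register is written once and every operation takes one cycle are all hypothetical.

```python
def pack_bundles(ops, width):
    """Greedily pack operations into VLIW bundles of at most `width` slots.

    `ops` is a list of (name, dest, sources) tuples in program order.
    Assumes every register is written once, so only true
    (read-after-write) dependences matter, and that every operation
    completes in one cycle.
    """
    bundles = []
    ready_at = {}  # register -> first bundle index where its value can be read
    for name, dest, sources in ops:
        earliest = max((ready_at.get(s, 0) for s in sources), default=0)
        slot = earliest
        # Skip bundles that are already full
        while slot < len(bundles) and len(bundles[slot]) >= width:
            slot += 1
        while len(bundles) <= slot:
            bundles.append([])
        bundles[slot].append(name)
        ready_at[dest] = slot + 1
    return bundles


# Hypothetical three-tap multiply-accumulate chain
program = [
    ("mul0", "p0", ("x0", "h0")),
    ("mul1", "p1", ("x1", "h1")),
    ("add0", "s0", ("p0", "p1")),
    ("mul2", "p2", ("x2", "h2")),
    ("add1", "s1", ("s0", "p2")),
]
for cycle, bundle in enumerate(pack_bundles(program, width=2)):
    print(cycle, bundle)
```

With two issue slots, the five operations in this example fit into three bundles instead of five sequential steps, because the independent multiplications can share a bundle.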
`
In modern DSPs, the architecture can be extended by duplicating
the processor cores. Enhanced DSPs utilize SIMD operations,
while multiple-issue DSPs may implement either VLIW or
superscalar architectures.
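
A SIMD operation on short vectors can be sketched in software by packing several narrow values into one machine word and operating on all of them at once. The Python sketch below packs four 8-bit lanes into a 32-bit word; the lane width, the helper names, and the test values are illustrative assumptions rather than a description of any specific DSP instruction.

```python
LANE_BITS = 8
LANES = 4
LANE_MASK = (1 << LANE_BITS) - 1


def pack(values):
    """Pack four 8-bit values into one 32-bit word (lane 0 in the low byte)."""
    word = 0
    for i, v in enumerate(values):
        word |= (v & LANE_MASK) << (i * LANE_BITS)
    return word


def unpack(word):
    """Split a 32-bit word back into its four 8-bit lanes."""
    return [(word >> (i * LANE_BITS)) & LANE_MASK for i in range(LANES)]


def simd_add(a, b):
    """Add four 8-bit lanes in one pass, wrapping each lane modulo 256."""
    low7 = 0x7F7F7F7F
    high = 0x80808080
    # Add the low 7 bits of every lane, then fold in the top bits with XOR
    # so that no carry crosses a lane boundary.
    return ((a & low7) + (b & low7)) ^ ((a ^ b) & high)


a = pack([1, 200, 255, 16])
b = pack([2, 100, 1, 16])
print(unpack(simd_add(a, b)))  # prints [3, 44, 0, 32]
```

A hardware SIMD unit performs the lane-wise addition directly; the masking here only emulates the lane boundaries that such hardware enforces.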
`
2.3 System on Chip (SoC) Based Architectures
Mobile device processor architecture became simpler with SoC
designs. Real-time responsiveness in mobile devices can be
managed by using an enhanced DSP hybrid chip. Lowering the
voltage of the chip enables low-power operation in mobile
devices.
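
The link between supply voltage and power can be made concrete with the commonly used approximation for CMOS dynamic power, P ≈ C·V²·f, where C is the switched capacitance, V the supply voltage, and f the clock frequency. This relation is a standard textbook approximation and is not taken from the cited sources. The Python sketch below compares hypothetical operating points; the function name and the scale factors are assumptions, and leakage power is ignored.

```python
def relative_dynamic_power(voltage_scale, frequency_scale, units=1):
    """Dynamic power relative to a baseline, using P ~ C * V^2 * f.

    Assumes switched capacitance grows linearly with the number of
    parallel units; leakage power is ignored.
    """
    return units * voltage_scale ** 2 * frequency_scale


# Hypothetical operating points, not measurements
baseline = relative_dynamic_power(1.0, 1.0)
lower_voltage = relative_dynamic_power(0.8, 1.0)        # same clock, 80% voltage
two_units = relative_dynamic_power(0.8, 0.5, units=2)   # two units at half clock
print(baseline, round(lower_voltage, 2), round(two_units, 2))
```

Because power scales with the square of the voltage in this model, two parallel units at half the clock frequency and a reduced voltage can match the baseline throughput at lower dynamic power, provided the workload can be parallelized.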
`
Weiss et al. [5] proposed a new scalable DSP architecture for
SoC domains. In SoC-based designs, system tasks can be managed
by integrating microcontrollers, dedicated ASICs, or DSPs in a
single chip, as shown in Figure 2 below:

[Figure 2 placeholder: SoC chip block diagram with ARM CPU, Digital Signal Processor (DSP), Multimedia Decoders/Encoders, Memory, Storage, Direct Memory Access, NIC, AUDIO, USB, and VIDEO blocks]

Figure 2: SoC Chip Architecture

Highly integrated SoCs leveraging multicore technology have
emerged for higher-performance and low-power designs. Low-power
operation often limits the architectural choices. The high
throughput of VLIW architectures in mobile devices requires a
fast memory system, such as cache memories.

To speed up the operation of mobile devices, many companies
have customized instruction sets. ARM Ltd. has done extensive
instruction set customization by encoding the most used
instructions in 16 bits, so as to support more read-write
operations.

The effect of instruction set customization on performance and
memory utilization can be understood from Figure 3. As the
instruction set is customized from 32-bit data types to 8-bit
types, memory utilization and overall performance improve.

[Figure 3 placeholder: chart titled "Performance and Memory Utilization"; axis "Operations per Cycle" with ticks from 0 to 30; categories Floating, 32-bit, 16-bit, 8-bit]

Figure 3: Effect of Instruction Set Customization on
Performance and Memory Utilization

Vorbach and Becker [13] proposed a reconfigurable processor
architecture for mobile phones. The dynamically Configurable
System on Chip (CSoC) architecture has been optimized for
mobile communications. A CSoC is customized for a specific
application, and its architecture consists of a processor core,
memory, ASIC cores, and on-chip reconfigurable hardware units.

Most smartphones today use single- or dual-core SoCs. For most
mobile applications, a faster dual-core CPU provides better
performance than a quad-core SoC. Future mobile SoCs will
become more sophisticated, improving overall performance.

2.4 ARM Processors for Mobiles
ARM-based processors are the most widely used in modern
smartphones. ARM is a 32-bit instruction set architecture based
on RISC principles [10]. ARM processors are particularly common
in smartphones because of their low power consumption and
strong performance.

ARM Holdings provides chip design and instruction set
customization licenses to third-party vendors such as Apple and
Qualcomm, who design their own products based on the provided
architecture.

Various ARM architectures are used in smartphones: ARMv5 in
low-end devices, and ARMv6 and ARMv7 in recent high-performance
devices. ARMv7 includes a hardware floating-point unit (FPU),
providing improved speed. The 32-bit ARM architecture, such as
ARMv7-A, is the most extensively used architecture in mobile
devices.

The ARM architecture is the main hardware architecture for most
mobile operating systems, such as iOS, Android, Windows Phone,
Windows RT, Bada, BlackBerry OS/BlackBerry 10, MeeGo, Firefox
OS, Tizen, Ubuntu Touch, Sailfish, and Igelle OS.

3. COMPARATIVE STUDY OF CONTEMPORARY MOBILE PHONE PROCESSORS
ARM Cortex, Qualcomm Snapdragon, and Nvidia Tegra are among the
most widely used mobile processors.

3.1 ARM Cortex Processors
ARM Cortex processor cores are categorized into the following
variants:
• Cortex-A Processors (ARM Application Processors)
• Cortex-R Processors (ARM Embedded Real-time Processors)
• Cortex-M Processors (ARM Embedded Processors)
• SecureCore Processors (ARM Secure Processors)
`
As an example, consider the architecture of the ARM Cortex-A8
depicted in Figure 4. This architecture includes NEON SIMD
media and signal processing technology for providing audio,
video, and 3D graphics to mobile applications. The instruction
set architecture of the ARM Cortex-A8 implements the Thumb-2
instruction set encoding, which mixes 16-bit and 32-bit
instructions so that code requires less external memory. The
AMBA (Advanced Microcontroller Bus Architecture) bus interface
supports input and output data buses that are either 64 or 128
bits wide. It performs L2 cache fills and non-cacheable
accesses for both instructions and data.
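
The memory saving from 16-bit instruction encodings can be estimated with simple arithmetic. The Python sketch below assumes a hypothetical program in which a given fraction of instructions use 16-bit encodings and the remainder use 32-bit encodings; the function name and the instruction mix are illustrative assumptions, and the sketch assumes the instruction count itself does not change with the encoding.

```python
def code_size_bytes(instruction_count, fraction_16bit):
    """Estimate code size when a fraction of instructions use 16-bit
    encodings (2 bytes) and the rest use 32-bit encodings (4 bytes)."""
    if not 0.0 <= fraction_16bit <= 1.0:
        raise ValueError("fraction_16bit must be between 0 and 1")
    short = instruction_count * fraction_16bit
    long_ = instruction_count - short
    return int(round(2 * short + 4 * long_))


# Hypothetical program of 1000 instructions
all_32bit = code_size_bytes(1000, 0.0)   # 4000 bytes
mixed = code_size_bytes(1000, 0.7)       # 2600 bytes
print(all_32bit, mixed)
```

In this hypothetical mix, encoding 70% of the instructions in 16 bits reduces the code size from 4000 to 2600 bytes, which is the kind of saving that motivates compact encodings on memory-constrained devices.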
`
`Figure 4: ARM Cortex-A8 Architecture
`
A single-core ARM Cortex-A8 processor with a 1.4 GHz clock
speed was considered reasonably sufficient until 2011. In 2014,
ARM processors clocked at 3 GHz are expected to become a
reality. These upcoming processors, manufactured on a 20 nm
process, are expected to offer 25% lower power consumption and
up to 30% faster clock speeds.
`
Mobile computing is gearing up for a drastic change this year
with the advent of 64-bit ARM-based processors, which are
expected to provide up to 50% better performance than existing
32-bit ARM processors. ARM’s new Cortex-A50 processor series,
based on the ARMv8 architecture, includes the Cortex-A53 and
Cortex-A57 processors. The Cortex-A57 is a performance-oriented
application processor, while the Cortex-A53 is a
power-efficient application processor.
`
In the near future, Android 5.0 is expected to make efficient
use of the new ARMv8 64-bit architecture. In addition to the
proposed 64-bit architectures, upcoming mobile phones are
expected to be equipped with 4 GB of RAM to support the
increasing complexity of 3D games on Android.
`
`3.2 Qualcomm Snapdragon Processors
Snapdragon is a family of mobile system-on-a-chip (SoC)
processors provided by Qualcomm. Scorpion, the original
Snapdragon CPU, had many features similar to the ARM Cortex-A8
core and was based on the ARMv7 instruction set, but with the
added advantage of higher performance from SIMD operations.
`
Qualcomm Snapdragon SoCs are built around the Krait processor
architecture, shown in Figure 5. It integrates an LTE (Long
Term Evolution) modem to support seamless connectivity across
2G, 3G, and 4G LTE networks. This architecture supports a wider
front end, with the ability to fetch and decode three
instructions per clock. The Adreno GPU in this architecture
delivers improved graphics performance. With Hexagon DSPs, this
architecture provides low-power operation for a variety of
multimedia applications, such as enhanced audio and video.
`
`Figure 5: Qualcomm Snapdragon Processor Architecture
`
Qualcomm was one of the first to introduce a 28 nm processor,
in 2012, with its Snapdragon S4 series. The GPU and overall
architecture are refined in the Snapdragon 600 and 800, but
these processors still use a 28 nm process.
`
In 2013, the Qualcomm Snapdragon 800 processor, with Krait 400
CPU cores clocked at 2.3 GHz, outperformed all other processors
in the mobile segment.
`
`
`
`
`
`
As shown in Figure 6 below, the Snapdragon 800 processor
consists of a 28 nm HPm quad-core Krait 400 CPU for high
performance, an Adreno 330 GPU for improved graphics
performance, a Hexagon DSP for low-power operation, and a
Gobi™ True 4G LTE modem for connectivity.
`
[Figure 6 placeholder: block diagram with Krait CPU, Adreno, Hexagon, Multimedia, Camera, Connectivity, Display / LCD, and Navigation blocks]

Figure 6: Qualcomm Snapdragon 800 Processor Architecture
`
Snapdragon 800 processors are designed to enable fast apps and
web browsing, high-quality graphics, advanced multimedia
capabilities, seamless communications, and long battery life
for premium smartphones.
`
The Snapdragon 800 processor provides seamless connected
computing and a rich mobile experience, with advanced features
such as Ultra HD video, multichannel HD audio, and advanced
imaging.

3.3 Nvidia Tegra Processors
Tegra is a SoC series for mobile devices developed by Nvidia.
It integrates an ARM architecture CPU, a graphics processing
unit (GPU), memory controllers, etc. in a single package. It
enables high performance and low power consumption for
audio/video applications.
`
The Nvidia Tegra 4 processor is a quad-core SoC with more GPU
cores, higher clock speeds, and improved efficiency. The GPU
architecture of Tegra 4 is shown in Figure 7. Vertices are
processed by six VPE (vertex processing engine) units. Next,
the vertices are cached by the IDX unit and passed to the
raster engine, which produces pixel fragments. The Early-Z unit
tests pixel fragments for Z-depth and passes on only visible
pixels. Early-Z processing in the Tegra 4 GPU architecture
results in improved performance and power savings. The GPU
includes four pixel fragment shader pipes, which implement a
VLIW architecture. Each pixel shader unit also contains a
texture filtering unit with its own L1 cache, backed by an L2
cache.
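
The idea behind Early-Z processing can be sketched as a depth test applied before any shading work is done. The Python sketch below is a conceptual illustration only and does not describe the Tegra 4 hardware; the fragment layout (x, y, z, payload), the convention that a smaller z is closer to the viewer, and the function name are all assumptions.

```python
def early_z_pass(fragments, width, height):
    """Return only the fragments that pass the depth test.

    Each fragment is a hypothetical (x, y, z, payload) tuple, where a
    smaller z means closer to the viewer. Fragments that fail the test
    are dropped before any shading would run.
    """
    far = float("inf")
    depth = [[far] * width for _ in range(height)]
    visible = []
    for x, y, z, payload in fragments:
        if z < depth[y][x]:
            depth[y][x] = z  # record the new nearest depth at this pixel
            visible.append((x, y, z, payload))
    return visible


fragments = [
    (0, 0, 0.5, "near"),
    (0, 0, 0.9, "hidden"),  # behind "near" at the same pixel
    (1, 0, 0.7, "other"),
]
print([f[3] for f in early_z_pass(fragments, width=2, height=1)])
# prints ['near', 'other']
```

In this sketch the saving depends on submission order: a hidden fragment is rejected only if a nearer fragment for the same pixel has already been seen, so front-to-back ordering removes the most shading work.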
`
According to an analysis of upcoming ARM processors, Nvidia’s
Tegra 4 SoC beats the best Qualcomm Snapdragon processor in
terms of performance. Although Nvidia has designed a
technically faster SoC, Qualcomm Snapdragon processors have an
edge when it comes to power efficiency.
`
Nvidia has recently launched its next-generation mobile
processor, the Tegra K1. It is a mobile processor with 192
graphics cores for mobile gaming applications. The Tegra K1 was
launched in two versions: one with traditional 32-bit “4+1” ARM
cores like the Tegra 4, and a dual-core 64-bit version. It is
said that the Tegra K1 is even more powerful than either the
Xbox 360 or the PlayStation 3.
`
[Figure 7 placeholder: GPU pipeline diagram with six Vertex units, IDX / Clip / Setup, Raster / Early Z, four Tex units each with an L1 cache, an L2 cache, Memory, and two 32b FB blocks]

Figure 7: Nvidia Tegra 4 GPU Architecture
`
`
`
`
`4. CONCLUSION
Different vendors are working towards the development of more
power-efficient mobile processor architectures with an eye on
the future of mobile computing. Essentially all modern mobile
processors are ARM-based and are marketed under different brand
names by different vendors.
`
With newer versions of mobile CPUs, we will have more powerful
smartphones with new GPU cores, memory interfaces, and many
more advanced features. Future mobile SoCs will explore
next-generation processor architectures to improve device
performance. Mobile processor manufacturers continue to work
on more powerful devices.
`
To support next-generation data-centric mobile devices,
processor architectures have to be designed with new approaches
in mind. Still, development in mobile processors is driven by
factors such as low power consumption, user interface
performance, and time to market.
`
`5. REFERENCES
[1] Farinaz Koushanfar, Vandana Prabhu, Miodrag Potkonjak,
Jan M. Rabaey: “Processors for Mobile Applications”.
`
`[2] Dr. Margarita Esponda: “Trends in Hardware Architecture
`for Mobile Devices”.
`
[3] Shiv Chaturvedi: “The role of digital signal processors
(DSP) for 3G mobile communication systems”, International
Journal on Emerging Technologies 1(1): 23-26 (2010).
`
`[4] “TMS320C62xx CPU and Instruction Set Reference
`Guide”: Texas Instruments.
`
`
`
`
`[5] Matthias H. Weiss, Frank Engel, and Gerhard P. Fettweis:
`“A new Scalable DSP Architecture for System on Chip
`(SOC) Domains”, IEEE International Conference on
`Acoustics, Speech, and Signal Processing.
`
[6] Alan Gatherer, Trudy Stetzler, Mike McMahan, and Edgar
Auslander, Texas Instruments: “DSP-Based Architectures for
Mobile Communications: Past, Present, and Future”.
`
[7] Ravi Managuli, Yongmin Kim: “VLIW Processor Architectures
and Algorithm Mappings for DSP Applications”.
`
[8] Berkeley Design Technology, Inc.: “VLIW Architectures for
DSP”.
`
`[9] Andrew Fallows and Patrick Ganson: “Smartphone
`Hardware Architecture”.
`
`[10] Leonid Ryzhyk: “The ARM Architecture”.
`
`[11] “ARM Processor Architecture”, SOC Consortium.
`
`[12] Georgescu, M.D: “Evolution of Mobile Processors”,
`Communications, Computers and signal Processing,
`2003. PACRIM. 2003 IEEE Pacific Rim Conference.
`
[13] Martin Vorbach, Jürgen Becker: “Reconfigurable processor
architectures for mobile phones”, Parallel and Distributed
Processing Symposium, 2003. Proceedings. International.
`
[14] Russell Tessier and Wayne Burleson: “Reconfigurable
Computing for Digital Signal Processing”, Journal of VLSI
Signal Processing 28, 7–27, 2001.
`
`IJCATM : www.ijcaonline.org
`
`
`