Proceedings

SEVENTH ANNUAL IEEE SYMPOSIUM ON
FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES
`
`Intel Exhibit 1011 - 1
April 21-23, 1999
Napa Valley, California

Sponsored by
IEEE Computer Society Technical Committee on Computer Architecture

Edited by
Kenneth L. Pocek and Jeffrey M. Arnold

IEEE Computer Society
Los Alamitos, California
Washington
Brussels
Tokyo

Copyright © 1999 by The Institute of Electrical and Electronics Engineers, Inc.
All rights reserved

Copyright and Reprint Permissions: Abstracting is permitted with credit to the source. Libraries may photocopy beyond the limits of US copyright law, for private use of patrons, those articles in this volume that carry a code at the bottom of the first page, provided that the per-copy fee indicated in the code is paid through the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923.

Other copying, reprint, or republication requests should be addressed to: IEEE Copyrights Manager, IEEE Service Center, 445 Hoes Lane, P.O. Box 1331, Piscataway, NJ 08855-1331.

The papers in this book comprise the proceedings of the meeting mentioned on the cover and title page. They reflect the authors' opinions and, in the interests of timely dissemination, are published as presented and without change. Their inclusion in this publication does not necessarily constitute endorsement by the editors, the IEEE Computer Society, or the Institute of Electrical and Electronics Engineers, Inc.

IEEE Computer Society Order Number PRO0375
ISBN 0-7695-0375-6
ISBN 0-7695-0377-2 (microfiche)
IEEE Order Plan Catalog Number PRO0375
ISSN 1082-3409

Additional copies may be ordered from:

IEEE Computer Society
Customer Service Center
10662 Los Vaqueros Circle
P.O. Box 3014
Los Alamitos, CA 90720-1314
Tel: +1-714-821-8380
Fax: +1-714-821-4641
E-mail: cs.books@computer.org

IEEE Service Center
445 Hoes Lane
P.O. Box 1331
Piscataway, NJ 08855-1331
Tel: +1-732-981-1393
Fax: +1-732-981-9667
mis.custserv@computer.org

IEEE Computer Society
Ooshima Building
1-4-2 Minami-Aoyama
Minato-ku, Tokyo 107-0062
JAPAN
Tel: +81-3-3408-3118
Fax: +81-3-3408-3553
Tokyo.ofc@computer.org

Editorial production by Thomas Baldwin
Cover art design by Joseph Daigle/Studio Productions
Printed in the United States of America by The Printing House

Table of Contents

SEVENTH ANNUAL IEEE SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES (FCCM’99)

Co-chairs & Program Committee .......... x

SESSION 1: TOOLS 1
CHAIR: Stephen Smith
Macro-Based Hardware Compilation of Java™ Bytecodes into a Dynamic Reconfigurable Computing System .......... 2
    J.M.P. Cardoso, H.C. Neto
A CAD Suite for High-Performance FPGA Design .......... 12
    B. Hutchings, P. Bellows, J. Hawkins, S. Hemmert, B. Nelson, M. Rytting
Formal Verification of Reconfigurable Cores .......... 25
    S. Singh, C.J. Lillieroth

SESSION 2: NETWORK APPLICATIONS
CHAIR: Mark Shand
Transmutable Telecom System and Its Application .......... 34
    T. Miyazaki, T. Murooka, M. Katayama, A. Takahara
Implementation and Evaluation of a Prototype Reconfigurable Router .......... 44
    J.R. Hess, D.C. Lee, S.J. Harper, M.T. Jones, P.M. Athanas

SESSION 3: COMPILATION
CHAIR: André DeHon
Pipeline Vectorization for Reconfigurable Systems .......... 52
    M. Weinhardt, W. Luk
Automatic Allocation of Arrays to Memories in FPGA Processors with Multiple Memory Banks .......... 63
    M.B. Gokhale, J.M. Stone
Parallelizing Applications into Silicon .......... 70
    J. Babb, M. Rinard, C.A. Moritz, W. Lee, M. Frank, R. Barua, S. Amarasinghe

SESSION 4: ARCHITECTURES
CHAIR: Scott Hauck
Reconfigurable Elements for a Video Pipeline Processor .......... 82
    M.R. Piacentino, G.S. van der Wal, M.W. Hansen
ConCISe: A Compiler-Driven CPLD-Based Instruction Set Accelerator .......... 92
    B. Kastrup, A. Bink, J. Hoogerbrugge

SESSION 5: TOOLS 2
CHAIR: Roger Woods
CPR: A Configuration Profiling Tool .......... 104
    S. Cadambi, S.C. Goldstein
Debugging Techniques for Dynamically Reconfigurable Hardware .......... 114
    N. McKay, S. Singh
Improving Simulation Accuracy in Design Methodologies for Dynamically Reconfigurable Logic Systems .......... 123
    M. Vasilko, D. Cabanis

SESSION 6: GRAPHICS APPLICATIONS
CHAIR: Herman Schmit
Reconfigurable Computing for Augmented Reality .......... 136
    W. Luk, T.K. Lee, J.R. Rice, N. Shirazi, P.Y.K. Cheung
Sepia: Scalable 3D Compositing using PCI Pamette .......... 146
    L. Moll, A. Heirich, M. Shand

SESSION 7: APPLICATIONS
CHAIR: Mike Butts
An Edge-Endpoint-Based Configurable Hardware Architecture for VLSI CAD Layout Design Rule Checking .......... 158
    Z. Luo, M. Martonosi, P. Ashar
FAFNER - Accelerating Nesting Problems with FPGAs .......... 168
    J.C. Alves, J.C. Ferreira, C. Albuquerque, J.F. Oliveira, J.S. Ferreira, J. Silva Matos

SESSION 8: DSP APPLICATIONS
CHAIR: Phil Kuekes
Field Programmable Gate Array Based Radar Front-End Digital Signal Processing .......... 178
    T.J. Moeller, D.R. Martinez
Optimizing FPGA-Based Vector Product Designs .......... 188
    D. Benyamin, W. Luk, J. Villasenor

SESSION 9: RUN TIME SYSTEMS
CHAIR: Satnam Singh
PCI-PipeRench and the SWORDAPI: A System for Stream-based Reconfigurable Computing .......... 200
    R. Laufer, R.R. Taylor, H. Schmit
Safe and Protected Execution for the Morph/AMRM Reconfigurable Processor .......... 209
    A.A. Chien, J.H. Byun
Implementing an API for Distributed Adaptive Computing Systems .......... 222
    M. Jones, L. Scharf, J. Scott, C. Twaddle, M. Yaconis, K. Yao, P. Athanas, B. Schott

SESSION 10: ARITHMETIC
CHAIR: Steve Casselman
A Super-serial Galois Fields Multiplier for FPGAs and its Application to Public-Key Algorithms .......... 232
    G. Orlando, C. Paar
Automatic Floating to Fixed Point Translation and its Application to Post-Rendering 3D Warping .......... 240
    M.P. Leong, M.Y. Yeung, C.K. Yeung, C.W. Fu, P.A. Heng, P.H.W. Leong
Dynamic Precision Management for Loop Computations on Reconfigurable Architectures .......... 249
    K. Bondalapati, V.K. Prasanna

POSTER SESSION 1
Accelerating Run-Time Reconfiguration on FCCMs .......... 260
    J.-P. Heron, R.F. Woods
A Virtual Hardware Handler for RTR Systems .......... 262
    R. Turner, R.F. Woods, S. Sezer, J.-P. Heron
Algorithm Analysis and Mapping Environment for Adaptive Computing Systems: Further Results .......... 264
    E.K. Pauer, P.D. Fiore, J.M. Smith
Development System for FPGA-Based Digital Circuits .......... 266
    V. Sklyarov, J. Fonseca, R. Monteiro, A. Oliveira, A. Melo, N. Lau, I. Skliarova, P. Neves, A. Ferrari
Design of a JTAG Based Run Time Reconfigurable System .......... 268
    C. Cousineau, F. Laperle, Y. Savaria
Architectures for System-Level Applications of Adaptive Computing .......... 270
    B. Schott, C. Chen, S. Crago, J. Czarnaski, M. French, I. Hom, T. Tho, T. Valenti
Task-level Partitioning and RTL Design Space Exploration for Multi-FPGA Architectures .......... 272
    V. Srinivasan, R. Vemuri
Enabling Automatic Module Generation for FCCM Compilers .......... 274
    A. Koch

POSTER SESSION 2
ICARUS: A Dynamically Reconfigurable Computer Architecture .......... 278
    M. Baxter
SONIC - A Plug-In Architecture for Video Processing .......... 280
    S.D. Haynes, P.Y.K. Cheung, W. Luk, J. Stone
A Reconfigurable Platform for Academic Purposes .......... 282
    C. Teuscher, J.-O. Haenni, F.J. Gómez, H.F. Restrepo, E. Sanchez
VHDL Placement Directives for Parametric IP Blocks .......... 284
    J. Hwang, C. Patterson, S. Mitra
Runlength Compression Techniques for FPGA Configurations .......... 286
    S. Hauck, W.D. Wilson

POSTER SESSION 3
Accelerating An IR Automatic Target Recognition Application with FPGAs .......... 290
    J. Jean, X. Liang, B. Drozd, K. Tomko
Mapping of an Automated Target Recognition Application from a Graphical Software Environment to FPGA-based Reconfigurable Hardware .......... 292
    B. Levine, S. Natarajan, C. Tan, D. Newport, D. Bouldin
Hybrid Data/Configuration Caching for Striped FPGAs .......... 294
    D. Deshpande, A.K. Somani, A. Tyagi
On Reconfiguring Cache for Computing .......... 296
    H.-S. Kim, A.K. Somani, A. Tyagi
Reconfigurable Pipelines in VLIW Execution Units .......... 298
    R.D. Williams, B.D. Kuebert
Fast Online Placement for Reconfigurable Computing Systems .......... 300
    K. Bazargan, M. Sarrafzadeh

POSTER SESSION 4
A Compact Fast Variable Key Size Elliptic Curve Cryptosystem Coprocessor .......... 304
    L. Gao, S. Shrivastava, H. Lee, G.E. Sobelman
A Virtual Logic Algorithm for Solving Satisfiability Problems Using Reconfigurable Hardware .......... 306
    M. Abramovici, J.T. de Sousa
Reducing Compilation Time of Zhong’s FPGA-based SAT solver .......... 308
    P.K. Chan, M.J. Boyd, S. Goren, K. Klenk, V. Kodavati, R. Kundu, M. Margolese, J. Sun, K. Suzuki, E. Thorne, X. Wang, J. Xu, M. Zhu
FPGA-based Structures for On-line FFT and DCT .......... 310
    D. Lau, A. Schneider, M.D. Ercegovac, J. Villasenor
An FPGA-based Fan Beam Image Reconstruction Module .......... 312
    L. Maltar, F.M.G. França, V.C. Alves, C.L. Amorim
Bezier Curve Rendering on Virtex™ .......... 314
    D. MacVicar, S. Singh, R. Slous

Author Index .......... 318

Safe and Protected Execution for the Morph/AMRM Reconfigurable Processor

Andrew A. Chien
Department of Computer Science and Engineering
University of California, San Diego
achien@cs.ucsd.edu

Jay H. Byun
Department of Computer Science
University of Illinois at Urbana-Champaign
jaybyun@cs.uiuc.edu

April 1, 1999

Abstract

Technology scaling of CMOS processes brings relatively faster transistors (gates) and slower interconnects (wires), making viable the addition of reconfigurability to increase performance. In the Morph/AMRM system, we are exploring the addition of reconfigurable logic, deeply integrated with the processor core, employing the reconfigurability to manage the cache, datapath, and pipeline resources more effectively. However, integration of reconfigurable logic introduces significant protection and safety challenges for multiprocess execution. We analyze the protection structures in a state-of-the-art microprocessor core (R10000), identifying the few critical logic blocks and demonstrating that the majority of the logic in the processor core can be safely reconfigured. Subsequently, we propose a protection architecture for the Morph/AMRM reconfigurable processor which enables nearly the full range of power of reconfigurability in the processor core while requiring only a small number of fixed logic features to ensure safe, protected multiprocess execution.

1. Introduction

Trends in semiconductor technology suggest that the use of reconfigurable logic blocks within the processor will be desirable in the future. Projections from the Semiconductor Industry Association (SIA) for the year 2007 indicate advanced semiconductor processes using 0.1 micron feature sizes [1]. However, this feature size, as measured by transistor channel length, is of decreasing importance to logic and circuit speed as well as processor speed. In systems of that era, logic density, logic speed, and processor speed will be dominated by interconnect performance and wiring density. For 2007, the SIA projects a pitch of 0.4-0.6 microns for the finest interconnect. Between logic blocks, average interconnect lengths typically range from 1,000x to 10,000x pitch -- up to 6 mm of intra-chip interconnect length. For such an interconnect, the achievable global clock period would be limited to approximately 1 nanosecond. Within a few technology generations, a crossover will occur, and the average interconnect delay will surpass logic block delays -- projections indicate that by the year 2007, average interconnect delay can be equivalent to five gate delays.

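The interconnect-length figure quoted above follows from multiplying the projected pitch by the length expressed in pitches. The short calculation below reproduces that arithmetic; the function and variable names are illustrative and do not come from the paper.

```python
# Worked arithmetic for the interconnect-length estimate in the text.
# The pitch range and the length multipliers are the figures quoted above;
# all names here are illustrative assumptions.
PITCH_UM = (0.4, 0.6)                # finest interconnect pitch, in microns
LENGTH_IN_PITCHES = (1_000, 10_000)  # typical wire length, in pitches


def interconnect_length_mm(pitch_um: float, pitches: int) -> float:
    """Convert a wire length given in pitches to millimetres."""
    return pitch_um * pitches / 1_000.0  # 1 mm = 1000 microns


shortest = interconnect_length_mm(PITCH_UM[0], LENGTH_IN_PITCHES[0])
longest = interconnect_length_mm(PITCH_UM[1], LENGTH_IN_PITCHES[1])
print(f"{shortest:.1f} mm to {longest:.1f} mm")
```

The upper bound (0.6 microns times 10,000 pitches) is the "up to 6 mm" figure in the text.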
Once past the cross-over point, dynamic interconnect (reconfigurable interconnect or logic) can be introduced at modest impact even on critical timing paths [2]. In such systems, the dynamic configurability in the processor can be used to significant advantage [4, 5], improving performance by factors of 10 to 100x for computational kernels while avoiding the traditional disadvantages of custom computing approaches such as I/O coprocessor coupling and slower logic [6]. In these systems, reprogrammable logic blocks will replace static interconnects in the processor core, paving the way for a new class of architectures which are customized to the application, delivering more robust and higher performance.

Figure 1. Reconfigurability in the processor core and the extended application-to-fixed-hardware interface

Reconfigurable, or application-adaptive, processors allow customization of mechanisms, bindings, and policies on a per-application basis. While current microprocessors implement a number of aggressive architectural techniques such as speculative execution, branch prediction, block prefetching, multi-level caching, etc. to achieve higher execution speeds, these mechanisms and policies are tuned
`Authorized licensed use limited to: University of Texas at Austin. Downloaded on July 24,2020 at 16:51:20 UTC from IEEE Xplore. Restrictions apply.
for a broad suite of applications (e.g. SPEC), and thus cannot be tightly matched to the needs of a particular application, procedure, or even loop in an application. For example, the cache block size and organization are chosen to maximize performance over a suite of applications, but may not give the best performance on any particular application. Similar constraints apply to other performance-critical aspects such as value prediction, branch prediction, and data movement. In contrast, a processor incorporating reconfigurability can adopt optimal policies (and in some cases better mechanisms) for the application, enabling increased execution efficiency. Thus, the reconfigurable logic can be used to tune the processor to better match the application, rather than being treated, as in the more traditional view, as an add-on coprocessor. One example of this per-application tuning would be to adapt the cache line size to maximize performance for that application [3]. This approach is embodied in the Morph/AMRM (Adaptive Memory Reconfiguration Management) architecture [4, 5], and the basic change in perspective is that the reconfigurable hardware is an extension of the application program, extending the application-to-fixed-hardware interface to enable more efficient execution. The fixed hardware then has a somewhat richer (and in parts lower-level) interface, as shown in Figure 1. Studies of Morph/AMRM have demonstrated that performance increases of 10 to 100 times are possible [5]. In essence, this is an extension of the application binary interface (ABI), but need not be a non-portable extension of the application programming interface (API) if appropriate CAD support is available. This approach is similar to that which has recently gained popularity in the software design community as "open implementations" [7], in which software architects recognize the need to open the implementation for customization for particular application uses in order to achieve adequate performance.

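To make the cache-line-size example concrete, the following is a minimal sketch of a direct-mapped cache model that counts misses for a given line size, plus a helper that picks the line size with the fewest misses for one application's access trace. It is an illustration under stated assumptions, not the Morph/AMRM mechanism: the cache capacity, the candidate line sizes, and both traces are hypothetical.

```python
def count_misses(addresses, cache_bytes, line_bytes):
    """Count misses in a direct-mapped cache for a trace of byte addresses."""
    num_lines = cache_bytes // line_bytes
    tags = [None] * num_lines            # block held by each cache line
    misses = 0
    for addr in addresses:
        block = addr // line_bytes       # memory block containing the byte
        index = block % num_lines        # cache line that block maps to
        if tags[index] != block:         # empty line or a different block
            tags[index] = block
            misses += 1
    return misses


def best_line_size(addresses, cache_bytes, candidates):
    """Return the candidate line size with the fewest misses on this trace."""
    return min(candidates,
               key=lambda size: count_misses(addresses, cache_bytes, size))


# Two hypothetical applications sharing the same 1 KiB cache (assumed size).
sequential = list(range(0, 4096, 4))     # unit-stride word accesses
ping_pong = [0, 1040] * 100              # two hot words about 1 KiB apart

for name, trace in (("sequential", sequential), ("ping_pong", ping_pong)):
    print(name, best_line_size(trace, 1024, (16, 64, 128)))
```

In this toy model the streaming trace favors long lines, because each miss brings in more useful data, while the two-word trace favors short lines, because long lines make the two hot words collide in the same cache line. A single fixed line size therefore cannot be best for both applications.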
Introducing application-controlled reconfigurability in the processor raises significant challenges for ensuring process isolation and protection (multiprocess isolation), a critical element of robust desktop and, to an increasing degree, embedded computing systems. Multiprocess isolation is an essential modularity element in software systems: without the guarantee of safely isolated and protected processes, the system can never be robust, since software faults cannot be contained and the system cannot be safely extended. It is essential for robust reconfigurable computing that an application's customization affect only its own computation, not that of other applications. For example, if application-defined hardware were allowed to control hardware addressing, it could allow unauthorized corruption of operating system data or even the data of other application processes. If application-defined hardware were allowed to control data prefetching, it could swamp the memory system with spurious requests. If application-defined hardware were allowed to control privilege mode changes, it could compromise all traditional protection structures.
Our study examines the protection structures of traditional processors and operating systems, and based on these lessons, proposes a safe multiprocess execution architecture for reconfigurable systems. We analyze in detail the software and hardware mechanisms central to process protection in conventional processors and operating systems, specifically studying the MIPS R10000 [8] microprocessor, an exemplar of a system employing the Unix/RISC protection architecture. This study elucidates the key mechanisms and architectural features for Unix-style two-mode protection and addressing-based isolation. The key feature of this protection architecture is process isolation via address isolation and mediation. Specifically,
1. All access to hardware devices is mediated by the operating system,
2. The operating system manages address translation to isolate processes,
3. Application processes cannot change the address translation information,
4. Application processes cannot substitute other translation information,
5. All application accesses are subject to this translation, and
6. The hardware ensures these guarantees.

We subsequently describe the Morph/AMRM architecture, outlining the dimensions of configurability and the hazards for multiprocess protection they induce. For the Morph/AMRM system, we then describe the protection architecture, describing in detail how each of the key properties of the operating system / processor protection architecture is provided. The key elements of this protection architecture are:
1. A hardwired control processor which controls instruction sequencing and privilege mode transitions
2. Hardwired control-processor-to-TLB control for address translation and TLB entry management
3. A requirement that all other configurable elements (system chip sets, input/output devices, memory controllers) deal in virtual addresses, with their accesses checked by local TLBs
4. Controlled access to key shared interconnects such as the system bus, enforced by hardwired arbiters which cannot be changed; the system reserves the highest priority to allow preemption for these resources
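Elements 3 and 4 of the list above can be sketched in software as follows. This is a rough illustration only: the paper does not specify the arbiter or local TLB at this level, so the priority encoding, the `Request` and `LocalTLB` types, and the tie-breaking rule are all assumptions introduced here.

```python
from dataclasses import dataclass

SYSTEM_PRIORITY = 0  # assumed encoding: lower number wins, 0 is reserved


@dataclass(frozen=True)
class Request:
    requester: str
    priority: int
    is_system: bool = False


def arbitrate(requests):
    """Fixed-priority bus arbitration with a reserved system priority.

    Application-configured requesters are clamped below the reserved level,
    so a system request can always preempt them. Ties go to the earliest
    request in the list (an arbitrary choice for this sketch).
    """
    ranked = []
    for order, req in enumerate(requests):
        prio = req.priority if req.is_system else max(req.priority,
                                                      SYSTEM_PRIORITY + 1)
        ranked.append((prio, order, req.requester))
    return min(ranked)[2] if ranked else None


class LocalTLB:
    """Per-device check: configurable logic issues virtual addresses only."""

    def __init__(self, entries):
        self._entries = dict(entries)  # virtual page -> (frame, writable)

    def check(self, vpage, write):
        """Return the physical frame, or None if the access is not allowed."""
        entry = self._entries.get(vpage)
        if entry is None:
            return None                # no mapping granted for this page
        frame, writable = entry
        if write and not writable:
            return None                # write to a read-only mapping
        return frame


winner = arbitrate([Request("prefetcher", 0), Request("os", 0, is_system=True)])
print(winner)
```

Because the clamping and the TLB contents sit outside the reconfigurable logic in this model, an application's configuration can ask for the top priority or name an unmapped page, but neither request is honored.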

This architecture enables configurability in the processor complex because it can ensure multiprocess protection (safe configuration). We also believe it enables much of the useful configurability in the processor complex, notably policies for more efficient management of resources and even the addition of instructions, special functional units, or even processor state. The model provided to application programs is a private, configurable, virtual machine which enables rich application customization. These applications (and their customizations) are cleanly isolated.

The remainder of the paper is organized as follows. Section 2 describes the basic problem of protected execution and process isolation in computer systems. Section 3 describes our analysis of the software and hardware mechanisms central to process protection in conventional processors and operating systems. Section 4 discusses the implications of reconfigurability for process protection and identifies the key requirements for safe process isolation in reconfigurable processors. In Section 5, we describe the Morph/AMRM system and a proposed protection architecture that meets the requirements set forth in Section 3. Section 6 discusses alternate approaches and the limitations on configurability imposed by the Morph/AMRM protection architecture. Section 7 summarizes future work and the material covered in this paper.

2. Process Isolation: the Problem

Figure 2: Multiprocess Protection based on Address Space Isolation

To understand the challenges of multiprocess isolation, it is instructive to first consider the possible modalities in which multiprocess isolation can be compromised. In the simplest mode, an application corrupts the data of another, causing it to fail or compute incorrectly. In a more complex mode, the application somehow locks up the machine, so no other application state is damaged, but neither can the machine make progress. One example of this would be jamming the memory bus or defeating the timer interrupt which ensures preemption. A more serious failure mode is to corrupt the operating system's data, which can lead to a machine crash in which all applications have data corruption. Finally, an application could also corrupt input/output device state, confounding the operating system, the device (leading to data loss or misdirection), or application data itself. In all of these cases, the failure is the result of allowing an application action which can affect the machine hardware state, other application memory state, or operating system state.
The key issue in safe multiprocess execution is to control access to hardware resources, ensuring that these accesses are non-interfering. In general, access to main memory, as well as other architecturally visible state (processor data registers, control registers), system chip registers, and input/output device state must be controlled. Traditional approaches partition memory access, virtualize resources such as processor data resources with multitasking, and use operating system calls to mediate operations which require access to control registers, system chip sets, input/output device state, etc. Note that isolation and virtualization must apply to any resource, at any level, of which a process can claim ownership. The final piece of the puzzle is that in order to support the virtualization and multitasking, transitions between the different entities must be carefully controlled to prevent compromise.

3. Process Isolation in the MIPS R10000

The key issue in maintaining a safe multiprocessing environment is ensuring process isolation: the processor and the OS must prevent independent processes from interfering with the data and memory of each other and of the operating system kernel. They must also prevent a malicious process from taking over the processor and locking up the system.

Through a detailed analysis of the R10000 architecture and operating system, we identify the hardware mechanisms and OS software structures that are central to process isolation. We chose the MIPS R10000 processor as an exemplar of a modern RISC processor that supports a relatively simple UNIX-style protection structure [9]. We first examine how a UNIX-style operating system ensures process isolation and thereby derive the hardware requirements it imposes. We then identify the corresponding support in the R10000 processor. In the following discussion, we assume that address translation is based on a simple paging system. Most of today’s systems actually employ multiple-paging or segmented paging, but the address translation mechanism is fundamentally the same as in a simple paging system.

3.1 Operating System-based Process Protection

3.1.1 Application and Operating System Memory Isolation

Application and operating system memory isolation is achieved through controlled address translation. The physical memory of each process is isolated by having each process's virtual address space pages map only to its own physical memory frames. To protect processes from modification by other processes, the memory-management hardware and the OS must prevent programs from changing their own address mappings. The UNIX kernel, for example, runs in a privileged mode (kernel mode or system mode) in which memory mapping may be controlled, whereas application processes run in an unprivileged mode (user mode). The page tables, which hold the mapping information for each process, reside in the memory space of the kernel so that they can only be modified by the OS running in kernel mode. This address translation control to ensure isolation is achieved through the following mechanisms in UNIX [9, 10].
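The two-mode rule described in this paragraph can be modeled in a few lines. This is a toy sketch, not UNIX or R10000 code: the `Machine` class, the `map_page` service, and the `syscall` entry point are hypothetical names chosen for illustration. It shows only that mapping changes succeed in kernel mode, fail in user mode, and that the kernel hands each process its own frames.

```python
class ProtectionFault(Exception):
    """Raised when unprivileged code attempts a privileged operation."""


class Machine:
    def __init__(self):
        self.mode = "user"      # current privilege mode
        self.page_tables = {}   # pid -> {virtual page: physical frame}
        self._next_frame = 0    # next unallocated physical frame

    def set_mapping(self, pid, vpage, frame):
        # Page tables live in kernel space: only kernel mode may write them.
        if self.mode != "kernel":
            raise ProtectionFault("mapping change attempted in user mode")
        self.page_tables.setdefault(pid, {})[vpage] = frame

    def map_page(self, pid, vpage):
        # Kernel service: allocate a fresh frame so processes never share one.
        frame = self._next_frame
        self._next_frame += 1
        self.set_mapping(pid, vpage, frame)
        return frame

    # Fixed table of kernel entry points; user code cannot add to it here.
    _SERVICES = {"map_page": map_page}

    def syscall(self, name, *args):
        # Controlled transition: enter kernel mode, run the service, return.
        service = self._SERVICES[name]
        previous, self.mode = self.mode, "kernel"
        try:
            return service(self, *args)
        finally:
            self.mode = previous


machine = Machine()
frame_a = machine.syscall("map_page", 1, 0)   # process 1, virtual page 0
frame_b = machine.syscall("map_page", 2, 0)   # process 2, same virtual page
print(frame_a != frame_b)
```

A direct call to `machine.set_mapping(...)` from user mode raises `ProtectionFault`, which is the software analogue of the hardware refusing a privileged operation.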

1. Locating correct translation information for each process.
By using a special page table base register (ptbr), which is set from the process control block (PCB) on each process switch, the OS can correctly locate the page table for the executing process. The index portion of the virtual address is then added to the address pointed to by the ptbr to locate the appropriate page table entry (PTE).
2. Distinguishing valid and invalid entries in page tables.
Notice that the page table can contain entries that are not used by the process. These unused entries correspond to pages that are not in the process’s logical address space, so using them would compromise process isolation. The OS uses valid-invalid bits to distinguish these entries. Alternatively, the page table can be implemented to contain only the entries that are actually used by the process. This implementation requires a special register containing the length of the process's page table, usually called the page-table length register (PTLR). The PTLR can be used to check that the page index portion of the virtual address is in range and therefore is not accessing illegal translation information.
3. Controlling access types.
Even when the address translation to physical memory frames is valid, access to those frames is otherwise unlimited; the process can read, write, and execute them. It is safer and more efficient to control the type of access. The protection bit field in the PTE provides this access control information. At the same time that the physical address is being computed, the protection bits can be checked to verify that no access is being made that has not been granted. These bits usually indicate whether the process has read/write, read-only, or execute-only access. The type and number of protection bits provided depend on the underlying processor.
4. Managing TLB consistency.
The translation information, namely the PTE, is cached in the processor's TLB to avoid an extra memory access to the page table. Using special privileged instructions, the OS updates the TLB with consistent mapping information when a miss occurs. But notice that after a context switch, although the new page table is pointed to by the n
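
Mechanisms 1 to 3 in the list above can be sketched as a single translation routine. The sketch below is a simplified model under several assumptions: a 4096-byte page size, page-table entries that each occupy one slot of a dictionary standing in for kernel memory, and both a valid bit and a PTLR check applied together (the text presents these two as alternatives). The `PTE` and `TranslationFault` names are hypothetical.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size for this sketch


class TranslationFault(Exception):
    """Raised when a virtual address may not be translated."""


@dataclass
class PTE:
    frame: int
    valid: bool
    readable: bool = True
    writable: bool = False
    executable: bool = False


def translate(kernel_memory, ptbr, ptlr, vaddr, access):
    """Translate vaddr for an access of type 'read', 'write' or 'execute'."""
    index, offset = divmod(vaddr, PAGE_SIZE)
    # Mechanism 2 (PTLR variant): the index must lie inside the page table.
    if index >= ptlr:
        raise TranslationFault("page index beyond page-table length")
    # Mechanism 1: the ptbr plus the page index locates the PTE.
    pte = kernel_memory[ptbr + index]
    # Mechanism 2 (valid-bit variant): unused entries are marked invalid.
    if not pte.valid:
        raise TranslationFault("invalid page-table entry")
    # Mechanism 3: protection bits are checked during translation.
    granted = {"read": pte.readable,
               "write": pte.writable,
               "execute": pte.executable}[access]
    if not granted:
        raise TranslationFault(f"{access} access not granted")
    return pte.frame * PAGE_SIZE + offset


# A two-entry page table placed at slot 100 of the stand-in kernel memory.
kernel_memory = {100: PTE(frame=7, valid=True), 101: PTE(frame=0, valid=False)}
print(translate(kernel_memory, ptbr=100, ptlr=2, vaddr=5, access="read"))
```

A write to the first page, any access to the second page, or any address at or beyond page index 2 raises `TranslationFault` in this model.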
