`Architecture Guide
`
`May 1999
`
`Order Number: 245188-001
`
`Oracle-1045 p.1
`Oracle v. Teleputers
`IPR2021-00078
`
`
`
`THIS DOCUMENT IS PROVIDED “AS IS” WITH NO WARRANTIES WHATSOEVER, INCLUDING ANY WARRANTY OF MERCHANTABILITY,
`NONINFRINGEMENT, FITNESS FOR ANY PARTICULAR PURPOSE, OR ANY WARRANTY OTHERWISE ARISING OUT OF ANY PROPOSAL,
`SPECIFICATION OR SAMPLE.
`Information in this document is provided in connection with Intel products. No license, express or implied, by estoppel or otherwise, to any intellectual
`property rights is granted by this document. Except as provided in Intel's Terms and Conditions of Sale for such products, Intel assumes no liability
`whatsoever, and Intel disclaims any express or implied warranty, relating to sale and/or use of Intel products including liability or warranties relating to
`fitness for a particular purpose, merchantability, or infringement of any patent, copyright or other intellectual property right. Intel products are not
`intended for use in medical, life saving, or life sustaining applications.
`Intel may make changes to specifications and product descriptions at any time, without notice.
`Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined." Intel reserves these for
`future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them.
`IA-64 processors may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current
`characterized errata are available on request.
`Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling
`1-800-548-4725, or by visiting Intel’s website at http://www.intel.com.
`Copyright © Intel Corporation, 1999
`*Third-party brands and names are the property of their respective owners.
`
`
`
`
`Contents
`
`1 About the IA-64 Application Developer’s Architecture Guide .......... 1-1
`1.1 Overview of the IA-64 Application Developer’s Architecture Guide .......... 1-1
`1.2 Terminology .......... 1-2
`1.3 Related Documents .......... 1-3
`
`Part I: IA-64 Application Architecture Guide
`
`2 Introduction to the IA-64 Processor Architecture .......... 2-1
`2.1 IA-64 Operating Environments .......... 2-1
`2.2 Instruction Set Transition Model Overview .......... 2-2
`2.3 IA-64 Instruction Set Features .......... 2-2
`2.4 Instruction Level Parallelism .......... 2-3
`2.5 Compiler to Processor Communication .......... 2-3
`2.6 Speculation .......... 2-3
`2.6.1 Control Speculation .......... 2-3
`2.6.2 Data Speculation .......... 2-4
`2.6.3 Predication .......... 2-4
`2.7 Register Stack .......... 2-5
`2.8 Branching .......... 2-6
`2.9 Register Rotation .......... 2-6
`2.10 Floating-point Architecture .......... 2-6
`2.11 Multimedia Support .......... 2-6
`
`3 IA-64 Execution Environment .......... 3-1
`3.1 Application Register State .......... 3-1
`3.1.1 Reserved and Ignored Registers .......... 3-1
`3.1.2 General Registers .......... 3-2
`3.1.3 Floating-point Registers .......... 3-3
`3.1.4 Predicate Registers .......... 3-4
`3.1.5 Branch Registers .......... 3-4
`3.1.6 Instruction Pointer .......... 3-4
`3.1.7 Current Frame Marker .......... 3-4
`3.1.8 Application Registers .......... 3-5
`3.1.9 Performance Monitor Data Registers (PMD) .......... 3-9
`3.1.10 User Mask (UM) .......... 3-10
`3.1.11 Processor Identification Registers .......... 3-10
`3.2 Memory .......... 3-12
`3.2.1 Application Memory Addressing Model .......... 3-12
`3.2.2 Addressable Units and Alignment .......... 3-12
`3.2.3 Byte Ordering .......... 3-12
`3.3 Instruction Encoding Overview .......... 3-14
`3.4 Instruction Sequencing .......... 3-15
`
`4 IA-64 Application Programming Model .......... 4-1
`4.1 Register Stack .......... 4-1
`4.1.1 Register Stack Operation .......... 4-1
`4.1.2 Register Stack Instructions .......... 4-3
`4.2 Integer Computation Instructions .......... 4-4
`
`IA-64 Application Developer’s Architecture Guide, Rev. 1.0
`
`
`
`
`4.2.1 Arithmetic Instructions .......... 4-4
`4.2.2 Logical Instructions .......... 4-5
`4.2.3 32-bit Addresses and Integers .......... 4-5
`4.2.4 Bit Field and Shift Instructions .......... 4-6
`4.2.5 Large Constants .......... 4-7
`4.3 Compare Instructions and Predication .......... 4-7
`4.3.1 Predication .......... 4-7
`4.3.2 Compare Instructions .......... 4-8
`4.3.3 Compare Types .......... 4-8
`4.3.4 Predicate Register Transfers .......... 4-10
`4.4 Memory Access Instructions .......... 4-10
`4.4.1 Load Instructions .......... 4-11
`4.4.2 Store Instructions .......... 4-12
`4.4.3 Semaphore Instructions .......... 4-12
`4.4.4 Control Speculation .......... 4-13
`4.4.5 Data Speculation .......... 4-16
`4.4.6 Memory Hierarchy Control and Consistency .......... 4-20
`4.4.7 Memory Access Ordering .......... 4-23
`4.5 Branch Instructions .......... 4-24
`4.5.1 Modulo-Scheduled Loop Support .......... 4-26
`4.5.2 Branch Prediction Hints .......... 4-28
`4.6 Multimedia Instructions .......... 4-29
`4.6.1 Parallel Arithmetic .......... 4-29
`4.6.2 Parallel Shifts .......... 4-30
`4.6.3 Data Arrangement .......... 4-31
`4.7 Register File Transfers .......... 4-31
`4.8 Character Strings and Population Count .......... 4-33
`4.8.1 Character Strings .......... 4-33
`4.8.2 Population Count .......... 4-33
`
`5 IA-64 Floating-point Programming Model .......... 5-1
`5.1 Data Types and Formats .......... 5-1
`5.1.1 Real Types .......... 5-1
`5.1.2 Floating-point Register Format .......... 5-2
`5.1.3 Representation of Values in Floating-point Registers .......... 5-2
`5.2 Floating-point Status Register .......... 5-5
`5.3 Floating-point Instructions .......... 5-7
`5.3.1 Memory Access Instructions .......... 5-7
`5.3.2 Floating-point Register to/from General Register Transfer Instructions .......... 5-13
`5.3.3 Arithmetic Instructions .......... 5-14
`5.3.4 Non-Arithmetic Instructions .......... 5-16
`5.3.5 Floating-point Status Register (FPSR) Status Field Instructions .......... 5-17
`5.3.6 Integer Multiply and Add Instructions .......... 5-17
`5.4 Additional IEEE Considerations .......... 5-18
`5.4.1 Definition of SNaNs, QNaNs, and Propagation of NaNs .......... 5-18
`5.4.2 IEEE Standard Mandated Operations Deferred to Software .......... 5-18
`5.4.3 Additions beyond the IEEE Standard .......... 5-18
`
`6 IA-32 Application Execution Model in an IA-64 System Environment .......... 6-1
`6.1 Instruction Set Modes .......... 6-1
`6.1.1 IA-64 Instruction Set Execution .......... 6-2
`
`
`
`
`6.1.2 IA-32 Instruction Set Execution .......... 6-2
`6.1.3 Instruction Set Transitions .......... 6-3
`6.1.4 IA-32 Operating Mode Transitions .......... 6-4
`6.2 IA-32 Application Register State Model .......... 6-4
`6.2.1 IA-32 General Purpose Registers .......... 6-8
`6.2.2 IA-32 Instruction Pointer .......... 6-8
`6.2.3 IA-32 Segment Registers .......... 6-9
`6.2.4 IA-32 Application EFLAG Register .......... 6-15
`6.2.5 IA-32 Floating-point Registers .......... 6-16
`6.2.6 IA-32 MMX™ Technology Registers .......... 6-21
`6.2.7 IA-32 Streaming SIMD Extension Registers .......... 6-22
`6.3 Memory Model Overview .......... 6-22
`6.3.1 Memory Endianness .......... 6-23
`6.3.2 IA-32 Segmentation .......... 6-23
`6.3.3 Self Modifying Code .......... 6-24
`6.4 IA-32 Usage of IA-64 Registers .......... 6-24
`6.4.1 IA-64 Register Stack Engine .......... 6-24
`6.4.2 IA-64 ALAT .......... 6-24
`6.4.3 IA-64 NaT/NaTVal Response for IA-32 Instructions .......... 6-25
`
`7 IA-64 Instruction Reference .......... 7-1
`7.1 Instruction Page Conventions .......... 7-1
`7.2 Instruction Descriptions .......... 7-2
`
`Part II: IA-64 Optimization Guide
`
`8 About the IA-64 Optimization Guide .......... 8-1
`8.1 Overview of the IA-64 Optimization Guide .......... 8-1
`
`9 Introduction to IA-64 Programming .......... 9-1
`9.1 Overview .......... 9-1
`9.2 Registers .......... 9-1
`9.3 Using IA-64 Instructions .......... 9-2
`9.3.1 Format .......... 9-2
`9.3.2 Expressing Parallelism .......... 9-2
`9.3.3 Bundles and Templates .......... 9-3
`9.4 Memory Access and Speculation .......... 9-3
`9.4.1 Functionality .......... 9-4
`9.4.2 Speculation .......... 9-4
`9.4.3 Control Speculation .......... 9-4
`9.4.4 Data Speculation .......... 9-5
`9.5 Predication .......... 9-5
`9.6 IA-64 Support for Procedure Calls .......... 9-6
`9.6.1 Stacked Registers .......... 9-6
`9.6.2 Register Stack Engine .......... 9-6
`9.7 Branches and Hints .......... 9-7
`9.7.1 Branch Instructions .......... 9-7
`9.7.2 Loops and Software Pipelining .......... 9-7
`9.7.3 Rotating Registers .......... 9-8
`9.8 Summary .......... 9-8
`
`
`
`
`10 Memory Reference .......... 10-1
`10.1 Overview .......... 10-1
`10.2 Non-Speculative Memory References .......... 10-1
`10.2.1 Stores to Memory .......... 10-1
`10.2.2 Loads from Memory .......... 10-1
`10.2.3 Data Prefetch Hint .......... 10-1
`10.3 Instruction Dependences .......... 10-2
`10.3.1 Control Dependences .......... 10-2
`10.3.2 Data Dependences .......... 10-3
`10.4 Using IA-64 Speculation to Overcome Dependences .......... 10-5
`10.4.1 IA-64 Speculation Model .......... 10-5
`10.4.2 Using IA-64 Data Speculation .......... 10-6
`10.4.3 Using Control Speculation in IA-64 .......... 10-8
`10.4.4 Combining Data and Control Speculation .......... 10-10
`10.5 Optimization of Memory References .......... 10-10
`10.5.1 Speculation Considerations .......... 10-10
`10.5.2 Data Interference .......... 10-11
`10.5.3 Optimizing Code Size .......... 10-12
`10.5.4 Using Post-Increment Loads and Stores .......... 10-12
`10.5.5 Loop Optimization .......... 10-13
`10.5.6 Minimizing Check Code .......... 10-14
`10.6 Summary .......... 10-15
`
`11 Predication, Control Flow, and Instruction Stream .......... 11-1
`11.1 Overview .......... 11-1
`11.2 Predication .......... 11-1
`11.2.1 Performance Costs of Branches .......... 11-1
`11.2.2 Predication in IA-64 .......... 11-2
`11.2.3 Optimizing Program Performance Using Predication .......... 11-3
`11.2.4 Predication Considerations .......... 11-6
`11.2.5 Guidelines for Removing Branches .......... 11-8
`11.3 Control Flow Optimizations .......... 11-9
`11.3.1 Reducing Critical Path with Parallel Compares .......... 11-9
`11.3.2 Reducing Critical Path with Multiway Branches .......... 11-11
`11.3.3 Selecting Multiple Values for One Variable or Register with Predication .......... 11-11
`11.3.4 Improving Instruction Stream Fetching .......... 11-13
`11.4 Branch and Prefetch Hints .......... 11-14
`11.5 Summary .......... 11-14
`
`12 Software Pipelining and Loop Support .......... 12-1
`12.1 Overview .......... 12-1
`12.2 Loop Terminology and Basic Loop Support .......... 12-1
`12.3 Optimization of Loops .......... 12-1
`12.3.1 Loop Unrolling .......... 12-2
`12.3.2 Software Pipelining .......... 12-3
`12.4 IA-64 Loop Support Features .......... 12-4
`12.4.1 Register Rotation .......... 12-4
`12.4.2 Note on Initializing Rotating Predicates .......... 12-5
`12.4.3 Software-pipelined Loop Branches .......... 12-5
`12.4.4 Terminology Review .......... 12-9
`
`
`
`
`12.5 Optimization of Loops in IA-64 .......... 12-9
`12.5.1 While Loops .......... 12-9
`12.5.2 Loops with Predicated Instructions .......... 12-11
`12.5.3 Multiple-Exit Loops .......... 12-12
`12.5.4 Software Pipelining Considerations .......... 12-14
`12.5.5 Software Pipelining and Advanced Loads .......... 12-15
`12.5.6 Loop Unrolling Prior to Software Pipelining .......... 12-16
`12.5.7 Implementing Reductions .......... 12-18
`12.5.8 Explicit Prolog and Epilog .......... 12-19
`12.5.9 Redundant Load Elimination in Loops .......... 12-21
`12.6 Summary .......... 12-22
`
`13 Floating-point Applications .......... 13-1
`13.1 Overview .......... 13-1
`13.2 FP Application Performance Limiters .......... 13-1
`13.2.1 Execution Latency .......... 13-1
`13.2.2 Execution Bandwidth .......... 13-2
`13.2.3 Memory Latency .......... 13-2
`13.2.4 Memory Bandwidth .......... 13-3
`13.3 IA-64 Floating-point Features .......... 13-3
`13.3.1 Large and Wide Floating-point Register Set .......... 13-3
`13.3.2 Multiply-Add Instruction .......... 13-6
`13.3.3 Software Divide/Square-root Sequence .......... 13-7
`13.3.4 Computational Models .......... 13-8
`13.3.5 Multiple Status Fields .......... 13-9
`13.3.6 Other Features .......... 13-10
`13.3.7 Memory Access Control .......... 13-11
`13.4 Summary .......... 13-13
`
`Part III: Appendices
`
`A Instruction Sequencing Considerations .......... A-1
`A.1 RAW Ordering Exceptions .......... A-2
`A.2 WAW Ordering Exceptions .......... A-3
`A.3 WAR Ordering Exceptions .......... A-4
`
`B IA-64 Pseudo-Code Functions .......... B-1
`
`C IA-64 Instruction Formats .......... C-1
`C.1 Format Summary .......... C-2
`C.2 A-Unit Instruction Encodings .......... C-8
`C.2.1 Integer ALU .......... C-8
`C.2.2 Integer Compare .......... C-11
`C.2.3 Multimedia .......... C-15
`C.3 I-Unit Instruction Encodings .......... C-18
`C.3.1 Multimedia and Variable Shifts .......... C-18
`C.3.2 Integer Shifts .......... C-24
`C.3.3 Test Bit .......... C-26
`C.3.4 Miscellaneous I-Unit Instructions .......... C-27
`C.3.5 GR/BR Moves .......... C-29
`C.3.6 GR/Predicate/IP Moves .......... C-29
`C.3.7 GR/AR Moves (I-Unit) .......... C-30
`
`
`
`
`C.3.8 Sign/Zero Extend/Compute Zero Index .......... C-31
`C.4 M-Unit Instruction Encodings .......... C-32
`C.4.1 Loads and Stores .......... C-32
`C.4.2 Line Prefetch .......... C-47
`C.4.3 Semaphores .......... C-49
`C.4.4 Set/Get FR .......... C-50
`C.4.5 Speculation and Advanced Load Checks .......... C-50
`C.4.6 Cache/Synchronization/RSE/ALAT .......... C-52
`C.4.7 GR/AR Moves (M-Unit) .......... C-53
`C.4.8 Miscellaneous M-Unit Instructions .......... C-54
`C.4.9 Memory Management .......... C-55
`C.5 B-Unit Instruction Encodings .......... C-58
`C.5.1 Branches .......... C-58
`C.5.2 Nop .......... C-63
`C.5.3 Miscellaneous B-Unit Instructions .......... C-63
`C.6 F-Unit Instruction Encodings .......... C-64
`C.6.1 Arithmetic .......... C-67
`C.6.2 Parallel Floating-point Select .......... C-68
`C.6.3 Compare and Classify .......... C-68
`C.6.4 Approximation .......... C-70
`C.6.5 Minimum/Maximum and Parallel Compare .......... C-71
`C.6.6 Merge and Logical .......... C-72
`C.6.7 Conversion .......... C-73
`C.6.8 Status Field Manipulation .......... C-73
`C.6.9 Miscellaneous F-Unit Instructions .......... C-74
`C.7 X-Unit Instruction Encodings .......... C-75
`C.7.1 Miscellaneous X-Unit Instructions .......... C-75
`C.7.2 Move Long Immediate64 .......... C-76
`C.8 Immediate Formation .......... C-77
`
`
`
`
`Figures
`
`3-1 Application Register Model .......... 3-3
`3-2 Frame Marker Format .......... 3-5
`3-3 RSC Format .......... 3-7
`3-4 BSP Register Format .......... 3-7
`3-5 BSPSTORE Register Format .......... 3-8
`3-6 RNAT Register Format .......... 3-8
`3-7 PFS Format .......... 3-9
`3-8 Epilog Count Register Format .......... 3-9
`3-9 User Mask Format .......... 3-10
`3-10 CPUID Registers 0 and 1 – Vendor Information .......... 3-11
`3-11 CPUID Register 2 – Processor Serial Number .......... 3-11
`3-12 CPUID Register 3 – Version Information .......... 3-11
`3-13 CPUID Register 4 – General Features/Capability Bits .......... 3-12
`3-14 Little-endian Loads .......... 3-13
`3-15
`3-16
`4-1
`4-2
`4-3
`4-4
`4-5
`5-1
`5-2
`5-3
`5-4
`5-5
`5-6
`
`Big-endian Loads ..................................