Homayoun
`
`Reference 24
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2136, p. 1
`
`

`

`MODERN SYSTEMS AND PRACTICES
`
`THOMAS STERLING, MATTHEW ANDERSON,
`AND MACIEJ BRODOWICZ
`
`FOREWORD BY C. GORDON BELL
`
`
`

`

`High Performance
`Computing
`Modern Systems and Practices
`
`Thomas Sterling
`Matthew Anderson
`Maciej Brodowicz
`School of Informatics, Computing, and Engineering
`Indiana University, Bloomington
`
`Foreword by C. Gordon Bell
`
`MORGAN KAUFMANN PUBLISHERS
`
`ELSEVIER
`
`AN IMPRINT OF ELSEVIER
`
`
`

`

`Morgan Kaufmann is an imprint of Elsevier
`50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States
`
`Copyright © 2018 Elsevier Inc. All rights reserved.
`
`No part of this publication may be reproduced or transmitted in any form or by any means, electronic or
`mechanical, including photocopying, recording, or any information storage and retrieval system, without
`permission in writing from the publisher. Details on how to seek permission, further information about the
`Publisher's permissions policies and our arrangements with organizations such as the Copyright Clearance
`Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.
`
`This book and the individual contributions contained in it are protected under copyright by the Publisher (other
`than as may be noted herein).
`
`Notices
`Knowledge and best practice in this field are constantly changing. As new research and experience broaden our
`understanding, changes in research methods, professional practices, or medical treatment may become
`necessary.
`
`Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using
`any information, methods, compounds, or experiments described herein. In using such information or methods
`they should be mindful of their own safety and the safety of others, including parties for whom they have a
`professional responsibility.
`
`To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any
`liability for any injury and/or damage to persons or property as a matter of products liability, negligence or
`otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the
`material herein.
`
`Library of Congress Cataloging-in-Publication Data
`A catalog record for this book is available from the Library of Congress
`
`British Library Cataloguing-in-Publication Data
`A catalogue record for this book is available from the British Library
`
`ISBN: 978-0-12-420158-3
`
`For information on all Morgan Kaufmann publications visit
`our website at https://www.elsevier.com/books-and-journals
`
`Working together to grow libraries in developing countries
`Book Aid International
`www.elsevier.com • www.bookaid.org
`
`Publisher: Katey Birtcher
`Acquisition Editor: Steve Merken
`Developmental Editor: Nate McFadden
`Production Project Manager: Punithavathy Govindaradjane
`Designer: Mark Rogers
`
`Typeset by TNQ Books and Journals
`
`
`

`

`Dedicated to
`
`Dr. Paul C. Messina
`
`Leader, colleague, collaborator, mentor, friend
`
`
`

`

`Contents
`
`Foreword ............................................................................................................................................. xix
`Preface ................................................................................................................................................ xxi
`Acknowledgments ............................................................................................................................. xxvii
`
`CHAPTER 1 Introduction ........................................................................................... 1
`1.1 High Performance Computing Disciplines ................................................................. 3
`1.1.1 Definition .......................................................................................................... 3
`1.1.2 Application Programs ....................................................................................... 4
`1.1.3 Performance and Metrics ................................................................................. 4
`1.1.4 High Performance Computing Systems ........................................................... 5
`1.1.5 Supercomputing Problems ............................................................................... 7
`1.1.6 Application Programming ................................................................................ 8
`1.2 Impact of Supercomputing on Science, Society, and Security ................................ 10
`1.2.1 Catalyzing Fraud Detection and Market Data Analytics .............................. 10
`1.2.2 Discovering, Managing, and Distributing Oil and Gas ................................. 10
`1.2.3 Accelerating Innovation in Manufacturing .................................................... 10
`1.2.4 Personalized Medicine and Drug Discovery ................................................. 11
`1.2.5 Predicting Natural Disasters and Understanding Climate Change ................ 12
`1.3 Anatomy of a Supercomputer ................................................................................... 14
`1.4 Computer Performance ............................................................................................. 16
`1.4.1 Performance .................................................................................................... 16
`1.4.2 Peak Performance ........................................................................................... 17
`1.4.3 Sustained Performance ................................................................................... 18
`1.4.4 Scaling ............................................................................................................ 18
`1.4.5 Performance Degradation ............................................................................... 19
`1.4.6 Performance Improvement ............................................................................. 20
`1.5 A Brief History of Supercomputing ......................................................................... 21
`1.5.1 Epoch I-Automated Calculators Through Mechanical Technologies ......... 22
`1.5.2 Epoch II-von Neumann Architecture in Vacuum Tubes ............................. 24
`1.5.3 Epoch III-Instruction-Level Parallelism ...................................................... 29
`1.5.4 Epoch IV-Vector Processing and Integration .............................................. 30
`1.5.5 Epoch V-Single-Instruction Multiple Data Array ....................................... 33
`1.5.6 Epoch VI-Communicating Sequential Processors and Very Large Scale Integration ............. 34
`1.5.7 Epoch VII-Multicore Petaflops .................................................................... 37
`1.5.8 Neodigital Age and Beyond Moore's Law .................................................... 37
`1.6 This Textbook as a Guide and Tool for the Student ................................................ 38
`
`
`1.7 Summary and Outcomes of Chapter 1 ..................................................................... 39
`1.8 Questions and Problems ........................................................................................... 40
`References ......................................................................................................................... 41
`CHAPTER 2 HPC Architecture 1: Systems and Technologies ............................... 43
`2.1 Introduction ............................................................................................................. 44
`2.2 Key Properties of HPC Architecture ...................................................................... 44
`2.2.1 Speed ............................................................................................................ 45
`2.2.2 Parallelism .................................................................................................... 45
`2.2.3 Efficiency ...................................................................................................... 46
`2.2.4 Power ............................................................................................................ 46
`2.2.5 Reliability ..................................................................................................... 47
`2.2.6 Programmability ........................................................................................... 48
`2.3 Parallel Architecture Families-Flynn's Taxonomy .............................................. 48
`2.4 Enabling Technology .............................................................................................. 51
`2.4.1 Technology Epochs ...................................................................................... 51
`2.4.2 Roles of Technologies .................................................................................. 55
`2.4.3 Digital Logic ................................................................................................. 55
`2.4.4 Memory Technologies .................................................................................. 58
`2.5 von Neumann Sequential Processors ..................................................................... 62
`2.6 Vector and Pipelining ............................................................................................. 64
`2.6.1 Pipeline Parallelism ...................................................................................... 65
`2.6.2 Vector Processing ......................................................................................... 68
`2.7 Single-Instruction, Multiple Data Array ................................................................ 69
`2.7.1 Single-Instruction, Multiple Data Architecture ........................................... 69
`2.7.2 Amdahl's Law .............................................................................................. 70
`2.8 Multiprocessors ....................................................................................................... 73
`2.8.1 Shared-Memory Multiprocessors ................................................................. 74
`2.8.2 Massively Parallel Processors ...................................................................... 76
`2.8.3 Commodity Clusters ..................................................................................... 77
`2.9 Heterogeneous Computer Structures ...................................................................... 78
`2.10 Summary and Outcomes of Chapter 2 ................................................................... 78
`2.11 Questions and Problems ......................................................................................... 80
`References ......................................................................................................................... 82
`CHAPTER 3 Commodity Clusters ............................................................................ 83
`3.1 Introduction ............................................................................................................... 84
`3.1.1 Definition of "Commodity Cluster" ............................................................... 84
`3.1.2 Motivation and Justification for Clusters ....................................................... 84
`3.1.3 Cluster Elements ............................................................................................. 85
`3.1.4 Impact on Top 500 List .................................................................................. 86
`3.1.5 Brief History ................................................................................................... 88
`3.1.6 Chapter Guide ................................................................................................. 90
`3.2 Beowulf Cluster Project ............................................................................................ 91
`3.3 Hardware Architecture .............................................................................................. 93
`3.3.1 The Node ....................................................................................................... 93
`3.3.2 System Area Networks ................................................................................... 94
`3.3.3 Secondary Storage .......................................................................................... 95
`3.3.4 Commercial Systems Summary ..................................................................... 95
`3.4 Programming Interfaces ............................................................................................ 97
`3.4.1 High Performance Computing Programming Languages .............................. 97
`3.4.2 Parallel Programming Modalities .................................................................. 97
`3.5 Software Environment .............................................................................................. 98
`3.5.1 Operating Systems .......................................................................................... 98
`3.5.2 Resource Management ................................................................................... 99
`3.5.3 Debugger ....................................................................................................... 101
`3.5.4 Performance Profiling ................................................................................... 101
`3.5.5 Visualization ................................................................................................. 101
`3.6 Basic Methods of Use ............................................................................................. 104
`3.6.1 Logging On ................................................................................................... 104
`3.6.2 User Space and Directory System ............................................................... 105
`3.6.3 Package Configuration and Building ........................................................... 110
`3.6.4 Compilers and Compiling ............................................................................ 112
`3.6.5 Running Applications ................................................................................... 113
`3.7 Summary and Outcomes of Chapter 3 ................................................................... 113
`3.8 Questions and Exercises ......................................................................................... 114
`References ....................................................................................................................... 114
`CHAPTER 4 Benchmarking ................................................................................... 115
`4.1 Introduction ........................................................................................................... 115
`4.2 Key Properties of an HPC Benchmark ................................................................ 117
`4.3 Standard HPC Community Benchmarks .............................................................. 120
`4.4 Highly Parallel Computing Linpack .................................................................... 120
`4.5 HPC Challenge Benchmark Suite ........................................................................ 123
`4.6 High Performance Conjugate Gradients .............................................................. 126
`4.7 NAS Parallel Benchmarks .................................................................................... 130
`4.8 Graph500 ............................................................................................................... 132
`4.9 Miniapplications as Benchmarks .......................................................................... 135
`4.10 Summary and Outcomes of Chapter 4 ................................................................. 138
`4.11 Exercises ............................................................................................................... 139
`References ....................................................................................................................... 139
`CHAPTER 5 The Essential Resource Management .............................................. 141
`5.1 Managing Resources ............................................................................................... 142
`5.2 The Essential SLURM ............................................................................................ 146
`5.2.1 Architecture Overview ................................................................................. 147
`5.2.2 Workload Organization ................................................................................ 148
`5.2.3 SLURM Scheduling ..................................................................................... 149
`5.2.4 Summary of Commands ............................................................................... 151
`5.2.5 SLURM Job Scripting .................................................................................. 166
`5.2.6 SLURM Cheat Sheet .................................................................................... 171
`5.3 The Essential Portable Batch System ..................................................................... 172
`5.3.1 Portable Batch System Overview ................................................................ 172
`5.3.2 Portable Batch System Architecture ............................................................ 173
`5.3.3 Summary of PBS Commands ...................................................................... 174
`5.3.4 PBS Job Scripting ......................................................................................... 184
`5.3.5 PBS Cheat Sheet .......................................................................................... 186
`5.4 Summary and Outcomes of Chapter 5 ................................................................... 187
`5.5 Questions and Problems ......................................................................................... 189
`References ....................................................................................................................... 190
`CHAPTER 6 Symmetric Multiprocessor Architecture .......................................... 191
`6.1 Introduction ............................................................................................................. 191
`6.2 Architecture Overview ............................................................................................ 192
`6.3 Amdahl's Law Plus ................................................................................................. 196
`6.4 Processor Core Architecture ................................................................................... 199
`6.4.1 Execution Pipeline ........................................................................................ 200
`6.4.2 Instruction-Level Parallelism ....................................................................... 201
`6.4.3 Branch Prediction ......................................................................................... 201
`6.4.4 Forwarding .................................................................................................... 202
`6.4.5 Reservation Stations ..................................................................................... 202
`6.4.6 Multithreading .............................................................................................. 203
`6.5 Memory Hierarchy .................................................................................................. 204
`6.5.1 Data Reuse and Locality .............................................................................. 204
`6.5.2 Memory Hierarchy ....................................................................................... 205
`6.5.3 Memory System Performance ...................................................................... 207
`6.6 PCI Bus ................................................................................................................... 209
`6.7 External I/O Interfaces ............................................................................................ 213
`6.7.1 Network Interface Controllers ...................................................................... 213
`6.7.2 Serial Advanced Technology Attachment ................................................... 215
`6.7.3 JTAG ............................................................................................................. 218
`6.7.4 Universal Serial Bus ..................................................................................... 220
`6.8 Summary and Outcomes of Chapter 6 ................................................................... 222
`6.9 Questions and Exercises ......................................................................................... 223
`References ....................................................................................................................... 224
`CHAPTER 7 The Essential OpenMP ...................................................................... 225
`7.1 Introduction ............................................................................................................. 225
`7.2 Overview of OpenMP Programming Model .......................................................... 226
`7.2.1 Thread Parallelism ........................................................................................ 226
`7.2.2 Thread Variables ........................................................................................... 228
`7.2.3 Runtime Library and Environment Variables .............................................. 228
`7.3 Parallel Threads and Loops .................................................................................... 231
`7.3.1 Parallel Threads ............................................................................................ 231
`7.3.2 Private ........................................................................................................... 232
`7.3.3 Parallel "For" ................................................................................................ 233
`7.3.4 Sections ......................................................................................................... 239
`7.4 Synchronization ...................................................................................................... 241
`7.4.1 Critical Synchronization Directive ............................................................... 242
`7.4.2 The Master Directive .................................................................................... 242
`7.4.3 The Barrier Directive ................................................................................... 243
`7.4.4 The Single Directive ..................................................................................... 243
`7.5 Reduction ................................................................................................................ 244
`7.6 Summary and Outcomes of Chapter 7 ................................................................... 245
`7.7 Questions and Problems ......................................................................................... 246
`Reference ........................................................................................................................ 247
`CHAPTER 8 The Essential MPI ............................................................................. 249
`8.1 Introduction ........................................................................................................... 250
`8.2 Message-Passing Interface Standards ................................................................... 251
`8.3 Message-Passing Interface Basics ........................................................................ 253
`8.3.1 mpi.h ........................................................................................................... 253
`8.3.2 MPI_Init ...................................................................................................... 253
`8.3.3 MPI_Finalize .............................................................................................. 254
`8.3.4 Message-Passing Interface Example-Hello World .................................. 254
`8.4 Communicators ..................................................................................................... 255
`8.4.1 Size ............................................................................................................. 256
`8.4.2 Rank ............................................................................................................ 256
`8.4.3 Example ...................................................................................................... 257
`8.5 Point-to-Point Messages ....................................................................................... 258
`8.5.1 MPI_Send .................................................................................................... 259
`8.5.2 Message-Passing Interface Data Types ...................................................... 259
`8.5.3 MPI_Recv .................................................................................................... 259
`8.5.4 Example ...................................................................................................... 260
`8.6 Synchronization Collectives ................................................................................. 262
`8.6.1 Overview of Collective Calls ..................................................................... 262
`8.6.2 Barrier Synchronization ............................................................................. 263
`8.6.3 Example ...................................................................................................... 264
`8.7 Communication Collectives ................................................................................. 265
`8.7.1 Collective Data Movement ......................................................................... 265
`8.7.2 Broadcast .................................................................................................... 268
`8.7.3 Scatter ......................................................................................................... 269
`8.7.4 Gather ......................................................................................................... 271
`8.7.5 Allgather ..................................................................................................... 272
`8.7.6 Reduction Operations ................................................................................. 274
`8.7.7 Alltoall ........................................................................................................ 277
`8.8 Nonblocking Point-to-Point Communication ....................................................... 279
`8.9 User-Defined Data Types ...................................................................................... 281
`8.10 Summary and Outcomes of Chapter 8 ................................................................. 283
`8.11 Exercises ............................................................................................................... 283
`References ....................................................................................................................... 284
`CHAPTER 9 Parallel Algorithms ........................................................................... 285
`9.1 Introduction ........................................................................................................... 285
`9.2 Fork-Join ............................................................................................................. 286
`9.3 Divide and Conquer .............................................................................................. 287
`9.4 Manager-Worker ................................................................................................. 291
`9.5 Embarrassingly Parallel ........................................................................................ 292
`9.6 Halo Exchange ...................................................................................................... 294
`9.6.1 The Advection Equation Using Finite Difference ..................................... 295
`9.6.2 Sparse Matrix Vector Multiplication .......................................................... 297
`9.7 Permutation: Cannon's Algorithm ........................................................................ 301
`9.8 Task Dataflow: Breadth First Search .................................................................... 306
`9.9 Summary and Outcomes of Chapter 9 ................................................................. 310
`9.10 Exercises ............................................................................................................... 311
`References ....................................................................................................................... 311
`CHAPTER 10 Libraries ........................................................................................... 313
`10.1 Introduction ........................................................................................................ 313
`10.2 Linear Algebra .................................................................................................... 315
`10.2.1 Basic Linear Algebra Subprograms ..................................................... 317
`10.2.2 Linear Algebra Package ....................................................................... 324
`10.2.3 Scalable Linear Algebra Package ........................................................ 326
`10.2.4 GNU Scientific Library ........................................................................ 326
`10.2.5 Supernodal LU ..................................................................................... 326
`10.2.6 Portable Extensible Toolkit for Scientific Computation ...................... 327
`10.2.7 Scalable Library for Eigenvalue Problem Computations .................... 328
`10.2.8 Eigenvalue SoLvers for Petaflop-Applications .................................... 328
`10.2.9 Hypre: Scalable Linear Solvers and Multigrid Methods ..................... 328
`10.2.10 Domain-Specific Languages for Linear Algebra ................................. 329
`10.3 Partial Differential Equations ............................................................................. 329
`10.4 Graph Algorithms ............................................................................................... 329
`10.5 Parallel Input/Output .......................................................................................... 330
`10.6 Mesh Decomposition .......................................................................................... 333
`10.7 Visualization ....................................................................................................... 334
`10.8 Parallelization ..................................................................................................... 334
`10.9 Signal Processing ................................................................................................ 334
`10.10 Performance Monitoring .................................................................................... 341
`10.11 Summary and Outcomes of Chapter 10 ............................................................. 342
`10.12 Exercises ............................................................................................................. 343
`References ..............................................................................
