Homayoun
`
`Reference 23
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2135, p. 1
`
`

`

`Third Edition
`
`High-Performance
`Computer Architecture
`
`
`Harold S. Stone
`IBM T.J. Watson
`Research Center
`and
`Courant Institute
`New York University
`
`Addison-Wesley Publishing Company
`
`Reading, Massachusetts
`Menlo Park, California • New York
`Don Mills, Ontario • Wokingham, England
`Amsterdam • Bonn • Sydney • Singapore
`Tokyo • Madrid • San Juan • Milan • Paris
`
`This book is in the Addison-Wesley Series in Electrical and Computer Engineering
`
`Library of Congress Cataloging-in-Publication Data
`
`Stone, Harold S.
`High-performance computer architecture/ Harold S. Stone.-3rd
`
`ed.
`
`p. cm.
`Includes bibliographical references and index.
`ISBN 0-201-52688-3
`1. Computer architecture. I. Title.
`QA76.9.A73S76 1993
`004.2'2-dc20
`
`92-32243
`CIP
`
`Copyright © 1993 by Addison-Wesley Publishing Company, Inc.
`All rights reserved. No part of this publication may be reproduced, stored in a
`retrieval system, or transmitted, in any form or by any means, electronic, mechanical,
`photocopying, recording, or otherwise, without the prior written permission of the
`publisher. Printed in the United States of America.
`1 2 3 4 5 6 7 8 9 10-HA-95949392
`
`To Jan, colleague and companion
`
`Preface
`
`Teaching computer architecture is an interesting challenge for the instructor
`because the field is in constant flux. What the architect does depends strongly
`on the devices available, and the devices have been changing every two to three
`years, with major breakthroughs once or twice a decade. Within the brief life
`of the first edition of this textbook a whole generation of processor and memory
`chips were first offered for sale, appeared in popular computers, and then
`gradually disappeared from the marketplace as their successors took their places.
`The particular features and strengths of those devices have given way to other
`features in various new combinations and new relative costs. Design practices
`are evolving to exploit the new devices for a new generation of machines. And
`they will evolve again as the next wave of devices appears in the coming years.
`What then should be taught to prepare students for what lies ahead? What
`information will remain important over the technical career of a student, and
`what information will soon become obsolete, of historical interest only? This
`text stresses design ideas embodied in many machines and the techniques for
`evaluating those ideas. The ideas and the evaluation techniques are the principles
`that will survive. The specific implementations of machines that one might
`choose in 1995, 2000, or 2005 reflect the basic principles described here as applied
`to the device technology currently prevailing. Effective designs are those that
`use technology cleverly and achieve balanced, efficient structures matched well
`to the class of problems they attack. This text stresses the means to achieve
`balance and efficiency in the context of any device technology.
`
`We use a multifaceted approach to teaching the reader how to prepare for
`the future. The major features are the following:
`1. Each topic is a general architectural approach: memory designs, pipeline
`techniques, and a variety of parallel structures.
`2. Within each topic the focus is on fundamental bottlenecks (memory bandwidth,
`processing bandwidth, communications, and synchronization) and
`how to overcome these bottlenecks for each topic area.
`3. The material addresses evaluation techniques to help the reader isolate aspects
`that are highly efficient from those that are not.
`4. A few machines whose structure is of historical interest are described to
`illustrate how the concepts can be implemented.
`5. Where appropriate, the text draws on examples of real applications and their
`architectural requirements.
`6. Exercises at the end of chapters give the reader an opportunity to sketch
`out designs and perform evaluation under a variety of technology-oriented
`constraints.
`The exercises are particularly important. They help the reader master the material
`by integrating a number of different ideas, often by working through a paper
`design that must satisfy some unusual set of constraints. In several exercises,
`the student is asked to produce a series of designs, each reflecting a different
`set of underlying devices. This helps the student gain experience in adapting
`basic techniques to new situations.
`The text is intended for the advanced undergraduate and first-year graduate
`students. It assumes the student has had a course in machine organization so
`that the basic operation of a processor is well understood. Some experience with
`assembly language is helpful, but not essential. Programming in a high-level
`language such as Pascal, however, is necessary to understand the applications
`used as examples. Mathematical background in probability is helpful for Chap­
`ter 2, linear systems or numerical methods for Chapters 4 and 5, and some
`exposure to operating systems will assist understanding of Chapter 7. In no case
`is the material absolutely required because the text contains sufficient discussion
`and references to source material to support the presentation.
`The text purposely avoids detailed descriptions of popular machines because
`in time the machines so described will inevitably be obsolete. In future years,
`a reader of such material may be led to think that the specific details of a
`successful machine represent good design decisions for the future as well as for
`the period in which the design was actually done. A better approach is for the
`individual instructor to discuss one or two current machines while using the
`text, with the notion that current machines can change each year at the discretion
`of the instructor. It is also possible to use the text without such supplementary
`material because the design exercises provide challenges that represent technology
`through the 1990s.
`
`We jokingly tell students that the subject matter enjoys a positive benefit
`from the rapid change in technology. The instructor need not create new exercises
`and examinations for each new class. The questions may be the same
`each year, but the answers will be different.
`A number of teaching aids are available with this edition. The exercises in
`Chapter 2 make use of traces of instruction execution for which a floppy disk
`with sample traces is available from the publisher for course adopters. The disk
`is in IBM-compatible format and can be accessed by programs written in a variety
`of programming languages.
`Prior to the publication of this text, thorough studies of cache behavior
`required main-frame computers for analysis due to the massive amounts of data
`to process. The techniques described in Chapter 2 show how to reduce the
`processing by as much as two orders of magnitude and make possible the use
`of a personal computer as the primary analysis tool. The analysis techniques
`were first made widely available in the first edition of this text, and have now
`become standard among computer architects. The exercises for Chapter 2 give
`the student ample opportunity to practice cache analysis on the sample traces
`and to practice evaluating design alternatives.
`An instructor's guide with solutions to selected exercises is also available
`from the publisher to course adopters. Among the solutions in the manual are
`sample solutions to some of the design exercises. The instructor should bear in
`mind that the design exercises can be satisfied by many different designs, and
`that the sample solutions are illustrative of good approaches, but are definitely
`not the only acceptable solutions. What is important is the reasoning used by
`the student to establish that a particular design meets the constraints imposed
`and is both efficient and effective in solving the given design problem.
`Three sets of video-taped lectures provide instructional aid in a different
`form. A set of eight lectures that cover the highlights of the entire text can be
`ordered by writing to Addison-Wesley, Reading, MA 01867, Attn: Engineering
`Editor. A set of three lectures on the topics of multiprocessor cache coherence
`and synchronization is available from the IEEE Computer Society Press, 10662
`Los Vaqueros Circle, Los Alamitos, CA 90720. Another set of three lectures on
`advanced topics in cache behavior and cache analysis is available from the Na­
`tional Technological University, 700 Centre Ave., Ft. Collins, CO 80526, Attn:
`Richard Soderberg. The videotapes focus on central issues, and describe these
`topics visually and orally in a way that cannot be done in writing. Students and
`instructors will find the video tapes very useful for intensive study in short
`courses or self-paced instruction. The video medium is an effective means for
`fast transfer of information, and it is a useful supplement to a slower-paced
`program of classroom lecture and intensive reading that encourages deeper
`understanding.
`Instructors familiar with the first edition will find new material on program
`behavior models, RISC architecture, and parallel synchronization. The material
`on program behavior has been introduced because machines have changed so
`quickly in recent years that designers are forced to produce new generations of
`processors without the benefit of traces of workloads for those processors. In
`such cases, the evaluation techniques described in Chapter 2 cannot be brought
`into play. The next best tool is to produce estimates of program behavior that
`can be used as input to design evaluations. We have incorporated some inter­
`esting new developments in program modeling that appeared after the publi­
`cation of the first edition.
`Similarly, RISC architecture and parallel synchronization have been devel­
`oping very quickly in recent years and demanded additional space in the new
`edition. Beyond these topics, small incremental changes in the remaining topics
`have helped bring them up to date and streamlined their presentation.
`The material in the text is structured in a modular fashion, with each chapter
`reasonably independent of every other chapter. The instructor can put together
`a course by selecting individual chapters and individual sections according to
`the background of the students, the prerequisites available, and the successor
`courses in the curriculum.
`Chapters 2 and 3 form the core material. Cache memories and pipeline
`structures are widely used today, and they are likely to be effective in the
`technologies that will emerge in the next several years. These chapters should
`be taught in all course offerings.
`For courses in which students have a good background in numerical methods,
`Chapters 4 and 5 show how parallel computer architectures are matched
`to problem domains. Students unfamiliar with the underlying mathematical
`applications will gain an understanding of computational methods in wide use
`from these chapters, and all readers will appreciate how data flow and synchronization
`of mathematical actions in an algorithm are directly supported by
`architectural features. The chapters are biased toward supercomputers and large­
`scale computations, but the material is useful as well for general purpose
`computers.
`Chapters 6 and 7 treat multiprocessors, which are more general purpose
`than the machines of Chapters 4 and 5. Multiprocessors were almost exclusively
`research vehicles in the 1970s, and were in commercial use in niche areas in the
`1980s. The 1990s will find a much broader use of multiprocessors as the speed
`of individual processors reaches the limit of metal interconnections. The highest
`sustainable clock rate for metal interconnections is roughly 200 to 250 MHz for
`a typical conductor geometry, although the clock rate can be boosted even higher
`at great expense by reducing the dimensions of all components and conductors.
`Computers in all classes from microprocessors to high-end machines started the
`1990s within one to two generations of this clock limit. To sustain increases in
`performance through the decade, the industry must embrace multiprocessing
`in virtually all computers, or must abandon metal interconnection technology
`for another technology such as optical fiber or optical waveguide technology.
`
`In this text, we explore the use of multiprocessors and leave the topic of
`optical interconnections for another time and another text. The multiprocessor
`discussion is oriented to where to seek performance improvement by using
`resources efficiently. The interplay of multiple disciplines is central to this dis­
`cussion. Each specialist on a design team should have a broad shallow knowledge
`of the full scope of a design, including hardware, software, architecture, and
`applications, while enjoying a much deeper knowledge of a specialty area. Chap­
`ters 6 and 7 give a broad view of multiprocessors and delve deeply into particular
`topics such as algorithm design and performance models that are relevant to all
`specialties. These chapters are recommended especially for curricula that em­
`phasize systems programming and computer engineering.
`In one semester, it is reasonable to complete selected sections of all chapters,
`or to cover Chapters 2 and 3 and two other chapters in depth. Chapter 1, which
`has no exercises, is to be used as background reading to set the tone of the
`exposition. The text can easily satisfy the needs of a two-quarter or two-semester
`sequence if the instructor chooses to use the full material.
`No matter which portion of the text is covered, working the exercises is
`critical for a thorough appreciation of the material. The design-oriented exercises
`can be rather frustrating at first because there is no clear indication of a correct
`answer. The reader wants to see exercises that can be answered quickly by
`jotting down a simple answer after a small amount of thought. What a pleasure
`to crank through a calculation and find the answer is 17.5. The design exercises
`are nothing like this. In a sense an answer is correct if it meets the constraints
`of the design. The reality is that the answer should be more than correct; it
`must be competitive.
`The point of working such exercises is not the final design, but rather the
`process of arriving at the final design. What alternatives were considered? How
`does the final design overcome basic problems? Did the student consider a
`reasonable set of alternatives or was there a valid approach missed that should
`have been considered? Is the evaluation of the design reasonable? For what
`assumptions concerning technology factors and workload characteristics is the
`given design an efficient one?
`After working through such problems the reader becomes familiar with the
`thought processes of the designer and gains both experience and insight into
`architectural design. Many exercises seem to capture real situations, and this is
`as intended. As in real situations, the reader may discover that there is no good
`solution, and a compromise has to be invented. Or there may be several rea­
`sonable solutions, and the reader has to pick one, possibly on the basis of
`characteristics that are secondary in importance because all solutions available
`have satisfactory primary characteristics. Many exercises have been drawn from
`design problems faced by the author, with constraints updated for the present
`and future.
`The preparation of this text represents the fruits of labor of many parties.
`The author's students, Tom Puzak, Zarka Cvetanovic, Dominique Thiebaut, and
`John Turek contributed a number of ideas to the text and exercises. They also
`offered helpful comments and criticisms as the project progressed. Kevin Don­
`ovan, David Epstein, and Robert Hinkley produced high-quality solutions to
`the exercises that appear in the instructor's guide. Other reviewers whose comments
`are reflected in these pages are William F. Applebe, Georgia Institute of
`Technology; Richard A. Erdrich, Unisys Corporation; John L. Hennessy, Stanford
`University; K. C. Murphy, Advanced Micro Devices; Paul Pederson, New
`York University; Richard L. Sites, Digital Equipment Corporation; Henry Levy,
`University of Washington; Glen Langdon, University of California at Santa Cruz;
`Peter Hsu, Sun Microcomputers, and Phil Emma, Jeff Lee, K. S. Natarajan,
`Howard Sachar, and Marc Surette, all with IBM. Collectively and individually,
`their work has aided greatly the process of developing material to make it easily
`accessible to the intended audience. The publication crew at Addison-Wesley
`did a remarkable job in putting the project together. Patsy DuMoulin, Bette
`Aaronson, and Karen Myer demonstrated that they know pipelining in practice
`better than the author does in theory, smoothly flowing the chapters through
`the tedious process of markup, text editing, and page composition in a remark­
`able example of proficiency in high-performance publishing. To Tom Robbins,
`we offer gratitude for support and encouragement in the project from its incep­
`tion to its completion.
`
`Chappaqua, New York
`
`H. S. S.
`
`Contents
`
`1 Introduction
`1.1 Technology and Architecture
`1.2 But Is It Art?
`1.2.1 The Cost Factor
`1.2.2 Hardware Considerations
`1.3 High-Performance Techniques
`1.3.1 Measuring Costs
`1.3.2 The Role of Applications
`1.3.3 The Impact of VLSI
`1.3.4 The Impact of Digital Communications
`1.3.5 The Effect of Technological Change on Cost
`1.3.6 Algorithms and Architecture
`1.4 Historical References
`
`2 Memory-System Design
`2.1 Exploiting Program Characteristics
`2.2 Cache Memory
`2.2.1 Basic Cache Structure
`2.2.2 Cache Design
`2.2.3 Cache Analysis: Trace Generation and Trace Length
`2.2.4 Efficient Cache Analysis
`2.2.5 Replacement Policies
`2.2.6 Footprints in the Cache
`2.2.7 Writing to the Cache
`2.2.8 Other Cache Metrics
`2.2.9 Modeling System Performance
`2.2.10 Modeling Cache Behavior
`2.3 Virtual Memory
`2.3.1 Virtual-Memory Structure
`2.3.2 Virtual-Memory Mapping
`2.3.3 Improving Program Locality
`2.3.4 Replacement Algorithms
`2.3.5 Buffering Effects in Virtual-Memory Systems
`Exercises
`
`3 Pipeline Design Techniques
`3.1 Principles of Pipeline Design
`3.2 Memory Structures in Pipeline Computers
`3.3 Performance of Pipelined Computers
`3.4 Control of Pipeline Stages
`3.4.1 Design of a Multi-Function Pipeline
`3.4.2 The Collision Vector and Pipeline Control
`3.4.3 Maximum Performance Pipelines
`3.4.4 Using Delays to Increase Performance
`3.4.5 Interlock Elimination
`3.5 Exploiting Pipeline Techniques
`3.5.1 Conditional Branches
`3.5.2 Internal Forwarding and Deferred Instructions
`3.5.3 Machines with Both Cache and Virtual Memory
`3.5.4 RISC Architectures
`3.5.5 Superscalar Architectures
`3.6 Historical References
`Exercises
`
`4 Characteristics of Numerical Applications
`4.1 Classification of Large-Scale Numerical Problems
`4.1.1 Continuum Models
`4.1.2 Particle Models
`4.2 Design Constraints for High-Performance Machines
`4.3 Architectures for the Continuum Model
`4.4 Algorithms for the Continuum Model
`4.4.1 The Cosmic Cube versus the ILLIAC IV
`4.4.2 Data-Flow Requirements
`4.4.3 Parallel Solutions
`4.4.4 Recursive Doubling and Cyclic Reduction
`4.5 The Perfect Shuffle
`4.5.1 The Perfect-Shuffle Interconnection Pattern
`4.5.2 Applications of the Perfect Shuffle
`4.6 Architectures for the Continuum Model: Which Direction?
`Exercises
`
`5 Vector Computers
`5.1 A Generic Vector Processor
`5.1.1 Multiple Memory Modules
`5.1.2 Intermediate Memories
`5.2 Access Patterns for Numerical Algorithms
`5.2.1 Gaussian Elimination
`5.3 Data-Structuring Techniques for Vector Machines
`5.4 Attached Vector-Processors
`5.5 Sparse-Matrix Techniques
`5.6 The GF-11: A Very High-Speed Vector Processor
`5.7 Final Comments on Vector Computers
`Exercises
`
`6 Multiprocessors
`6.1 Background
`6.2 Multiprocessor Performance
`6.2.1 The Basic Model: Two Processors with Unoverlapped Communications
`6.2.2 Extension to N Processors
`6.2.3 A Stochastic Model
`6.2.4 A Model with Linear Communication Costs
`6.2.5 An Optimistic Model: Fully Overlapped Communication
`6.2.6 A Model with Multiple Communication Links
`6.2.7 Multiprocessor Models
`6.3 Multiprocessor Interconnections
`6.3.1 Bus Interconnections
`6.3.2 Ring Interconnections
`6.3.3 Crossbar Interconnections
`6.3.4 Two- and Three-Dimensional Meshes
`6.3.5 The Shuffle-Exchange Interconnection and the Combining Switch
`6.3.6 The Butterfly Operation and the Reverse-Binary Transformation
`6.3.7 The Combining Network and Fetch-and-Add
`6.3.8 Hypercube Interconnections
`
`1
`
`Architecture is preeminently the art
`of significant forms in space-that is,
`forms significant of their functions.
`-Claude Bragdon, 1931
`
`Introduction
`
`1.1 Technology and Architecture
`1.2 But Is It Art?
`1.3 High-Performance Techniques
`1.4 Historical References
`
`This text is devoted to the study of the architecture of high-speed computer
`systems, with emphasis on design and analysis. We view a computer system
`as being constructed from a variety of functional modules such as processors,
`memories, input/output channels, and switching networks. By architecture, we
`mean the structure of the modules as they are organized in a computer system.
`The architectural design of a computer system involves selecting various functional
`modules such as processors and memories and organizing them into a
`system by designing the interconnections that tie them together. This is analogous
`to the architectural design of buildings, which involves selecting materials
`and fitting the pieces together to form a viable structure.
`
`1.1 Technology and Architecture
`Computer architecture is driven by technology. Every year brings new devices,
`new functions, and new possibilities. An imaginative and effective architecture
`for today could be a klunker for tomorrow, and likewise, a ridiculous proposal
`
`for today may be ideal for tomorrow. There are no absolute rules that say that
`one architecture is better than another.
`The key to learning about computer architecture is learning how to evaluate
`architecture in the context of the technology available. It is as important to know
`if a computer system makes effective use of processor cycles, memory capacity,
`and input/output bandwidth as it is to know its raw computational speed. The
`objective is to look at both cost and performance, not performance alone, in
`evaluating architectures. Because of changes in technology, relative costs among
`modules as well as absolute costs change dramatically every few years, so the
`best proportion of different types of modules in a cost-effective design changes
`with technology.
`This text takes the approach that it is methodology, not conclusions, that
`needs to be taught. We present a menu of possibilities, some reasonable today
`and some not. We show how to construct high-performance systems by making
`selections from the menus, and we evaluate the systems produced in terms of
`technology that exists at the start of the 1990s. The conclusions reached by these
`evaluations are probably reasonable through the middle of the decade, but in
`no way do we claim that the architectures that look strongest today will be the
`best as we turn to a new millennium.
`The methodology, however, is timeless. From time to time the computer
`architect needs to construct a new menu of design choices. With that menu and
`the design and evaluation techniques described in this text, the architect should
`be able to produce high-quality systems in any decade for the technology at that
`time.
`Performance analysis should be based on the architecture of the total system.
`Design and analysis of high-performance systems is very complex, however,
`and is best approached by breaking the large system into a hierarchy of functional
`blocks, each with an architecture that can be analyzed in isolation. If any single
`function is very complicated, it too can be further refined into a collection of
`more primitive functions. Processor architecture, for example, involves putting
`together registers, arithmetic units, and control logic to create processors, the
`computational elements of a computer system.
`An important facet of processor architecture is the design of the instruction
`set for the processor. In years past, there were controversies raging over whether
`instruction sets should be very simple or very complex. The controversies were
`not settled with a single solution; instruction sets continue to evolve with dif­
`ferent underlying philosophies. But as part of the evolution, each different
`approach is influenced by the others, and incorporates advantages of other
`approaches where possible. We illuminate the factors that determine the quality
`of an instruction set, and in any technology an architect can measure those
`factors for a new design to guide the design process.
`Computer architecture is sometimes confused with the design of computer
`
`aesthetics, for which we have no absolute measures. We have no absolute test
`to conclude whether the work is a masterpiece or a piece of junk. If the art world
`agrees that it is a masterpiece, then it is a masterpiece.
`Computer architecture, too, has an aesthetic side, but it is quite different
`from the arts. We can evaluate the quality of an architecture in terms of maximum
`number of results per cycle, program and data capacity, and cost, as well as
`other measures that tend to be important in various contexts. We need never
`debate a question such as, "but is it fast?"
`Architectures can be compared on critical measures when choices must be
`made. The challenge comes because technology gives us new choices each year,
`and the decisions from last year may not hold this year. Not only must the
`architect understand the best decision for today, but the architect must factor
`in the effects of expected changes in technology over the life of a design. There­
`fore, not only do evaluation techniques play a crucial role in individual decisions,
`but by using these techniques over a period of years, the architect gains expe­
`rience in understanding the impact of technological developments on new ar­
`chitectures and is able to judge trends for several years in the future.
`Here are the principal criteria for judging an architecture:
`
`• Performance;
`• Cost; and
`• Maximum program and data size.
`
`There are a dozen or more other criteria, such as weight, power consumption,
`volume, and ease of programming, that may have relatively high significance
`in particular cases, but the three listed here are important in all applications and
`critical in most of them.
`
`1.2.1 The Cost Factor
`
`The cost criterion deserves a bit more explanation because so many people are
`confused about what it means. The cost of a computer system to a user is the
`money that the user pays for the system, namely its price. To the designer, cost
`is not so clearly defined. In most cases, cost is the cost of manufacturing, in­
`cluding a fair amortization of the cost of development and capital tools for
`construction. All too often we see comparisons of architectures that compare
`the parts cost of System A with the purchase price of System B, where System
`A is a novel architecture that is being proposed as an innovation, and System
`B represents a model in commercial production.
`Another fallacious comparison is often made when relating hardware to
`software. In the early years of computing, software was often bundled free of
`charge with hardware, but, as the industry matured, software itself became a
`commodity of value to be sold.
`
`We now discover that what was once a free good now commands a signif­
`icant portion of a computing budget. The trends that people quote are depicted
`in Fig. 1.1, where we see the cost of software steadily rising with inflation and
`complexity, and with apparently little relief from advances in software tools.
`Plotted on the same curve is the general trend for hardware in the same period
`of time. Hardware components appear to be diminishing in cost at an unbelievable
`rate. If we project these trends forward ten to twenty years, we may
`believe that hardware might be bundled with software, given free with the
`purchase of the software that runs on it. But this view is rather naive.
`Software and hardware costs each have two components:
`
`1. A one-time development cost; and
`2. A per-unit manufacturing cost.
`
`The actual cost of a product, be it software or hardware, is shown in Fig. 1.2 as
`a function of the volume of production of a product. Note that the cost of the
`first unit is equal to the cost of the development. The cost curve moves upward
`with volume, but the slope tends to diminish with very high volumes because
`of manufacturing experience that tends to reduce per-unit costs over large vol­
`umes of production. The curve in Fig. 1.2 shows accumulated cost of the total
`volume of a product. The price of the product is the cost shown on the curve
`divided by the volume, plus a markup for profit. So price is very sensitive to
`volume when development costs are high.
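The pricing relationship described above can be sketched in a few lines of Python. This is a minimal illustration, not a model taken from the text: the function names, the constant per-unit manufacturing cost (the text notes that the per-unit cost actually falls at very high volumes), the multiplicative 20 percent markup, and all dollar figures are assumptions chosen only to show how strongly price depends on volume when development cost dominates.

```python
def accumulated_cost(volume, development_cost, unit_cost):
    """Total cost of producing `volume` units.

    Assumption: a one-time development cost plus a constant per-unit
    manufacturing cost (no learning-curve reduction at high volume).
    """
    if volume < 1:
        raise ValueError("volume must be at least 1")
    return development_cost + unit_cost * volume


def unit_price(volume, development_cost, unit_cost, markup=0.2):
    """Price per unit: accumulated cost divided by volume, plus a markup.

    Assumption: the markup is a fraction of the average cost per unit.
    """
    average_cost = accumulated_cost(volume, development_cost, unit_cost) / volume
    return average_cost * (1.0 + markup)


if __name__ == "__main__":
    # Hypothetical product: $1,000,000 to develop, $50 per unit to manufacture.
    for volume in (1_000, 10_000, 100_000, 1_000_000):
        price = unit_price(volume, 1_000_000.0, 50.0)
        print(f"volume {volume:>9,d}: price per unit {price:10.2f}")
```

When the development cost is large relative to the per-unit cost, the price in this sketch falls almost in proportion to volume at first and then flattens toward the marked-up manufacturing cost.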
`When software was essentially free, the development costs were either bun-
`
`[Figure: normalized cost (hardware on a log scale) plotted against year from 1950 to 1990, with one curve for software and one for hardware.]
`
`Fig. 1.1 A naive view of computer-cost trends.
`
`
`of the database-management software may be sold. This alone can account for
`a factor-of-ten difference in price.
`Our analysis also shows why in years to come hardware costs will still prove
`to be significant compared to software costs. At issue here is the cost of manufacturing.
`Software manufacturing costs are near zero today and can only go
`lower, so that software pricing in a competitive market mainly reflects the
`amortization of development costs.
`Hardware manufacturing costs, while small on a per-chip basis, are many
`times more than software manufacturing costs. It is far less costly today to
`replicate accurate copies of software than it is to replicate hardware. Hardware
`requires assembly and testing to make sure that each copy is a faithful copy of
`the original design. This is far more complex today than the quality assurance
`on a software manufacturing line that simply has to compare each bit of infor­
`mation in software to see if it agrees with the original program.
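The contrast between software and hardware pricing can be made explicit with a short worked formula. The symbols are assumptions introduced here for illustration and do not appear in the text: $D$ is the one-time development cost, $m$ a constant per-unit manufacturing cost, $V$ the volume sold, and $\mu$ a fractional markup.

```latex
\[
  p(V) \;=\; (1+\mu)\,\frac{D + mV}{V}
       \;=\; (1+\mu)\left(\frac{D}{V} + m\right),
  \qquad
  \lim_{V\to\infty} p(V) \;=\; (1+\mu)\,m .
\]
```

With $m \approx 0$, as for software, the price is essentially the amortized development cost $(1+\mu)D/V$. With $m > 0$, as for hardware, the price under these assumptions cannot fall below $(1+\mu)m$ however large the volume becomes.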
`Figure 1.2 suggests a strategy for the development and pricing of VLSI chips,
`hardware, and software. Development costs have to be amortized over the
`volume of units sold. The price of a unit
