`
`Introduction
`
`This text is devoted to the study of the architecture of high-speed computer
`systems, with emphasis on design and analysis. We view a computer system
`as being constructed from a variety of functional modules such as processors,
`memories, input/output channels, and switching networks. By architecture,
`we mean the structure of the modules as they are organized in a computer
`system. The architectural design of a computer system involves selecting
`various functional modules such as processors and memories and organizing
`them into a system by designing the interconnections that tie them together.
`This is analogous to the architectural design of buildings, which involves
`selecting materials and fitting the pieces together to form a viable structure.
`
`1.1 Technology and Architecture
`
Computer architecture is driven by technology. Every year brings new devices, new functions, and new possibilities. An imaginative and effective architecture for today could be a klunker for tomorrow, and likewise, a
`1
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2143, p. 2
`
`
`
`2
`
`Introduction
`
`Chap. l
`
ridiculous proposal for today may be ideal for tomorrow. There are no absolute rules that say that one architecture is better than another.
The key to learning about computer architecture is learning how to evaluate architecture in the context of the technology available. It is as important
`to know if a computer system makes effective use of processor cycles, memory
`capacity, and input/output bandwidth as it is to know its raw computational
`speed. The objective is to look at both cost and performance, not performance
`alone, in evaluating architectures. Because of changes in technology, relative
`costs among modules as well as absolute costs change dramatically every few
`years, so the best proportion of different types of modules in a cost-effective
`design changes with technology.
`This text takes the approach that it is methodology, not conclusions, that
`needs to be taught. We present a menu of possibilities, some reasonable today
and some not. We show how to construct high-performance systems by making selections from the menus, and we evaluate the systems produced in
`terms of technology that exists in the mid-1980s. The conclusions reached by
`these evaluations are probably reasonable through the end of the decade, but
`in no way do we claim that the architectures that look strongest today will be
`the best in the next decade.
`The methodology, however, is timeless. From time to time the computer
`architect needs to construct a new menu of design choices. With that menu
`and the design and evaluation techniques described in this text, the architect
should be able to produce high-quality systems in any decade for the technology at that time.
`Performance analysis should be based on the architecture of the total
`system. Design and analysis of high-performance systems is very complex,
however, and is best approached by breaking the large system into a hierarchy of functional blocks, each with an architecture that can be analyzed in
`isolation. If any single function is very complicated, it too can be further
`refined into a collection of more primitive functions. Processor architecture,
`for example, involves putting together registers, arithmetic units, and control
logic to create processors, the computational elements of a computer
`system.
An important facet of processor architecture is the design of the instruction set for the processor, and we shall learn in the course of this text that there are controversies raging today concerning whether instruction sets should be very simple or very complex. We do not settle this controversy here; there cannot be a single answer. But we do illuminate the factors that determine the answer, and in any technology an architect can measure those
`factors in the course of a new design.
Computer architecture is sometimes confused with the design of computer hardware. Because computer architecture deals with modules at a functional level, not exclusively at a hardware level, computer architecture
`must encompass more than hardware. We can specify, for example, that a
`processor performs arithmetic and logic functions, and we can be reasonably
`sure that these functions will be built into the hardware and not require
`additional programming. If we specify memory management functions in the
processor, the actual implementation of those functions may be some mix of hardware and software, with the exact mix depending on performance, availability of existing hardware or software components, and costs.
When very-large-scale integration (VLSI) was in its infancy, memory-management functions were implemented in software, and the processor architecture had to support such software by providing only a collection of registers for address mapping and protection. With VLSI it becomes possible to embed a greater portion of memory management in hardware. Many systems employ sophisticated algorithms in hardware for performing memory-management functions once exclusively implemented in software.
The line between hardware and software becomes somewhat fuzzy when last year's software is embedded directly in read-only memory on a memory-management chip, where it is invisibly invoked by the programs being managed. Once such a chip is packaged, it is a "black box" that does memory management, and the solution becomes a hardware solution. The architect who uses the chip need not provide additional software for memory management. If a chip does most, but not all, memory-management functions internally, then the architect must look into providing the missing features by incorporating software modules.
`In retrospect, computer architecture makes systems from components,
`and the components can be hardware, software, or a mixture of both. The
`skill involved in architecture is to select a good collection of components and
`put them together so they work effectively as a total system. Later chapters
`show various examples of architectures, some proven successful and some
`proposals that might succeed.
`
`1.2 But Is It Art?
An article in the New York Times in January 1985 described the discovery of an unsigned painting by de Kooning that raised a few eyebrows among art critics. Although it does not bear his signature, there was no doubt that it was his work, and it was hung in a gallery for public viewing. The piece is a bench from the outhouse of his summer beach house that de Kooning painted abstractly to give the appearance of marble. Is this piece a great work of art by a
`renowned master, or is it just a painted privy seat? The point is that art
`appreciation is based on aesthetics, for which we have no absolute measures.
`We have no absolute test to conclude whether the work is a masterpiece or a
`piece of junk. If the art world agrees that it is a masterpiece, then it is a
`masterpiece.
`
`Computer architecture, too, has an aesthetic side, but it is quite different
`from the arts. We can evaluate the quality of an architecture in terms of
`maximum number of results per cycle, program and data capacity, and cost,
as well as other measures that tend to be important in various contexts. We need never debate a question such as, "But is it fast?"
`Architectures can be compared on critical measures when choices must
`be made. The challenge comes because technology gives us new choices each
`year, and the decisions from last year may not hold this year. Not only must
`the architect understand the best decision for today, but the architect must
`factor in the effects of expected changes in technology over the life of a design.
`Therefore, not only do evaluation techniques play a crucial role in individual
`decisions, but by using these techniques over a period of years, the architect
`gains experience in understanding the impact of technological developments
`on new architectures and is able to judge trends for several years in the
`future.
`Here are the principal criteria for judging an architecture:
`
`• Performance;
`• Cost; and
`• Maximum program and data size.
`
`There are a dozen or more other criteria, such as weight, power consumption,
`volume, and ease of programming, that may have relatively high significance
`in particular cases, but the three listed here are important in all applications
`and critical in most of them.
`
`1.2.1 The Cost Factor
`
`The cost criterion deserves a bit more explanation because so many people
`are confused about what it means. The cost of a computer system to a user is
`the money that the user pays for the system, namely its price. To the designer,
`cost is not so clearly defined. In most cases, cost is the cost of manufacturing,
`including a fair amortization of the cost of development and capital tools for
`construction. All too often we see comparisons of architectures that compare
`the parts cost of System A with the purchase price of System B, where System
`A is a novel architecture that is being proposed as an innovation, and System
`B represents a model in commercial production.
`Another fallacious comparison is often made when relating hardware to
`software. In the early years of computing, software was often bundled free of
`charge with hardware, but, as the industry matured, software itself became a
`commodity of value to be sold.
We now discover that what was once a free good now commands a significant portion of a computing budget. The trends that people quote are depicted in Fig. 1.1, where we see the cost of software steadily rising with inflation and complexity, and with apparently little relief from advances in software tools. Plotted on the same curve is the general trend for hardware in the same period of time. Hardware components appear to be diminishing in cost at an unbelievable rate. If we project these trends forward ten to twenty years, we may believe that hardware might be bundled with software, given free with the purchase of the software that runs on it. But this view is rather naive.

[Fig. 1.1 A naive view of computer-cost trends. Normalized cost over time: software rising; hardware (log of normalized cost) falling.]
Software and hardware costs each have two components:
1. A one-time development cost; and
2. A per-unit manufacturing cost.

The actual cost of a product, be it software or hardware, is shown in Fig. 1.2 as a function of the volume of production of a product. Note that the cost of the first unit is equal to the cost of the development. The cost curve moves upward with volume, but the slope tends to diminish at very high volumes because manufacturing experience tends to reduce per-unit costs over large volumes of production. The curve in Fig. 1.2 shows the accumulated cost of the total volume of a product. The price of the product is the cost shown on the curve divided by the volume, plus a markup for profit. So price is very sensitive to volume when development costs are high.
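To make this relationship concrete, the short sketch below computes unit price as cumulative cost divided by volume, plus a profit markup. Every figure in it is hypothetical (the development cost, first-unit manufacturing cost, experience-curve exponent, and markup are ours, chosen only to illustrate the shape of the curve).

```python
def cumulative_cost(volume, development_cost, first_unit_cost, learning_exponent=0.9):
    """Accumulated cost of producing `volume` units: the development cost
    plus per-unit manufacturing costs that shrink slowly with volume
    (a crude model of the 'manufacturing experience' effect)."""
    manufacturing = sum(first_unit_cost * n ** (learning_exponent - 1.0)
                        for n in range(1, volume + 1))
    return development_cost + manufacturing

def unit_price(volume, development_cost, first_unit_cost, markup=0.3):
    """Price per unit: cumulative cost spread over the volume, plus a profit markup."""
    total = cumulative_cost(volume, development_cost, first_unit_cost)
    return total / volume * (1.0 + markup)

# Hypothetical figures: $5M development cost, $20 to manufacture the first unit.
for v in (1_000, 100_000, 1_000_000):
    print(f"volume {v:>9,}: price per unit ${unit_price(v, 5e6, 20.0):,.2f}")
```

With these made-up numbers, the price per unit falls from several thousand dollars at a volume of one thousand to under twenty dollars at a volume of one million, which is exactly the volume sensitivity described above.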
`When software was essentially free, the development costs were either
`bundled with the hardware development costs, borne by the users who
developed the bulk of their own software, or simply not accounted for or recovered by the software producers. Since hardware manufacturing costs were very high compared to today, software manufacturing costs were small relative to hardware manufacturing costs. As long as development costs did not have to be covered through the direct sale of software, it was reasonable to give away the software.

[Fig. 1.2 Cumulative production costs versus volume: accumulated cost rising with production volume over a range of 0 to 100,000 units.]
`Eventually, software development costs became significant and could no
`longer be ignored. But software replication is still essentially free. We can
`easily draw a parallel with some VLSI chips in mass production (for example,
`a complex microprocessor chip with a few hundred thousand transistors).
The chip-development cost may be about the same order of magnitude as the cost of the development of the operating system or of a database-management applications package that runs on the chip.
`In very high volume, the manufacturing cost per chip is the same order of
`magnitude as the manufacturing cost of the software. Yet we see the chip sold
`at a price that may be as little as one-tenth the price of the software, and the
`computer system that contains the chip may be priced at ten times the cost of
the software. The prices of the chip, the software, and the computer system seem unrelated to manufacturing costs. In part, the price is determined by volume
`of sales, because the price must recover the development costs. Several
`million copies of the chip may be sold, but perhaps only a few hundred
`thousand copies of the database-management software may be sold. This
`alone can account for a factor-of-ten difference in price.
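The factor-of-ten claim is easy to check with a back-of-the-envelope sketch. Every figure below is hypothetical, chosen only to mirror the scenario in the text: comparable development costs, small per-copy manufacturing costs, and sales volumes that differ by more than a factor of ten.

```python
# Hypothetical figures mirroring the text's scenario.
dev_cost = 20e6           # assume ~$20M development for chip and software alike
mfg_chip = 5.0            # assumed per-chip manufacturing cost
mfg_sw = 2.0              # assumed per-copy software manufacturing cost

volume_chip = 5_000_000   # "several million copies of the chip"
volume_sw = 300_000       # "a few hundred thousand copies" of the software

price_chip = dev_cost / volume_chip + mfg_chip   # ~ $9
price_sw = dev_cost / volume_sw + mfg_sw         # ~ $69

print(f"chip ~ ${price_chip:.0f}, software ~ ${price_sw:.0f}, "
      f"ratio ~ {price_sw / price_chip:.1f}:1")
```

Even with identical development costs, the volume difference alone pushes the software price to nearly an order of magnitude above the chip price.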
`
`Our analysis also shows why in years to come hardware costs will still
`prove to be significant compared to software costs. At issue here is the cost of
`manufacturing. Software manufacturing costs are near zero today and can
only go lower, so that software pricing in a competitive market mainly reflects the amortization of development costs.
`Hardware manufacturing costs, while small on a per-chip basis, are many
`times more than software manufacturing costs. It is far less costly today to
replicate accurate copies of software than it is to replicate hardware. Hardware requires assembly and testing to make sure that each copy is a faithful
`copy of the original design. This is far more complex today than the quality
`assurance on a software manufacturing line that simply has to compare each
`bit of information in software to see if it agrees with the original program.
We see that hardware pricing carries the burden of per-unit manufacturing costs together with development costs, whereas software pricing
reflects development costs to a much greater extent. When computers fit on a single chip, their prices should bear some similarity to software prices. Indeed, we see hand calculators sold for roughly the same price as the most popular simple software tools. But computers that contain hundreds or thousands of individual components are far more complex to reproduce than any software package. At the very least, the hardware manufacturer has to test the chips and systems to reject the failures, and the corresponding process in software manufacturing is negligible because copying software is low cost, reliable, and inexpensively verified. In a competitive market, it is very unlikely that computers of moderate or high performance will be given away to purchasers of the accompanying software.
`
`1.2.2 Hardware Considerations
`
`Another fallacious argument about new designs for the future concerns the
lavish use of hardware components in a system. The architects state convincingly that with current trends in force, the cost of hardware will be
`negligible, so that we can afford to build systems of much greater hardware
`complexity in the future than we can today. Clearly, there is truth in this
`argument to the extent that future systems will surely be more powerful and
`complex at equal cost to today's systems. But the argument must be used
`with care because it does not excuse gross waste of hardware.
In the future, given System A, with 100 times the logic of present systems, and System B, whose performance is essentially identical to A's but which has only 10 or 20 times the logic of present systems, System A will be at a serious
`competitive disadvantage. For a few hundred or a few thousand copies of
`System A sold, System A may be priced competitively with System B. For
`higher volumes of production, however, the inefficiency of the architecture of
`System A will force its price higher than System B's for equal system value.
`Of course, this presumes that both System A and System B are built from
components of the same generation of technology. If A's chips are ten times as dense as B's chips and therefore ten times less costly per device, then the argument changes, and device technology, not architecture, is the determining factor in the price of the system.
`Throughout this text we explore the study of architecture by considering
`innovations of the future that depend on low-cost components. But we shall
`always heed the efficiency of the architectures we examine to be sure that we
`are using our building blocks well.
`Consider, for example, a multiprocessor system in which there exists no
`shared memory, and suppose that we want to run a parallel program in which
`each processor executes the same program. Obviously, we can load identical
`copies of the program in all processors. When the program is small or the
number of processors is rather modest, the memory consumed by the multiple copies may be quite tolerable.
`But what if the program is a megabyte in size, and what if we plan to use
`1000 processors in our system? Then the copies of the program account for a
`gigabyte of storage, which need not be present if there were some way to
`share one copy of code across all processors.
`If System A uses multiple copies of programs, and System B, through a
`clever design, achieves nearly equal performance with a single copy, then the
`extra gigabyte of memory required by System A could well make System A
`totally uncompetitive with System B, unless the cost of storage becomes so
`insignificant that a gigabyte of memory accounts for a paltry fraction of the
`cost of a system. System A's architect hopes that the cost per bit of memory
will tumble in the future, but System A requires 10¹⁰ more bits, and this is an
`enormous multiplier. If current historical trends continue, a drop in cost per
`bit to offset an inefficiency of this magnitude would probably take twenty to
`thirty years.
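The arithmetic behind this estimate can be made explicit. The program size and processor count below come from the example in the text; the assumption that memory cost per bit halves roughly every two years is ours, standing in for the "current historical trends" mentioned above.

```python
import math

# Figures from the text: a one-megabyte program replicated across 1000 processors.
program_bytes = 1_000_000
processors = 1_000

extra_bytes = program_bytes * (processors - 1)    # storage beyond one shared copy
extra_bits = extra_bytes * 8
print(f"extra storage: {extra_bytes / 1e9:.2f} GB, about {extra_bits:.0e} bits")

# Assumption: cost per bit of memory halves roughly every two years.
# To offset a 1000-fold memory inefficiency, cost per bit must fall ~1000-fold.
halvings = math.log2(processors)                  # ~10 halvings needed
print(f"~{2 * halvings:.0f} years at one halving every two years")
```

At one halving every two years the wait is about twenty years, the low end of the twenty-to-thirty-year range given above.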
`In the example just presented, the architect of System A has to be aware
`of other approaches that could overcome a basic flaw in System A for the
`particular application. System A might be totally effective for other applica(cid:173)
`tions in which each processor requires a different program. But in the given
`context, System B has a tremendous, probably insurmountable advantage.
`The architect should measure the quality of the architecture across a
`number of applications that characterize how an architecture is to be used.
`The effectiveness may vary considerably from application to application, and
`such measurements should reveal where the architecture is truly beneficial to
`the user and where other approaches are superior.
`A computer architecture might well have some minor but costly inherent
`flaws that escape the scrutiny of its designer. A different designer who can
`build essentially the same architecture with those flaws repaired can produce
`a more effective, and therefore more competitive, machine. Architects cannot
`hide inefficiency by arguing that hardware costs nothing.
`As a simple example of this rule, consider an architecture with a rather
`large number of processors, such as 16,000, and assume that the processors
`are to be used in an application where the speedup attributed to N processors
is proportional to log₂ N. (As astonishing as this sounds, such proposals have
`been made.) The 16,000 processors yield only a speedup of 14x for some
`constant x. The architect argues cogently that the 16,000 processors are so
`inexpensive that we can ignore their cost. The important fact is that the
`application runs 14x times faster than it runs on a single processor, and the
`speed increase is worth the small extra cost for the processors.
In this competitive world, the gross inefficiency of the architecture cannot escape notice for long. Soon there appears a System B to compete with this System A. System B's architecture is identical to A's in this case, except that it is a rather scaled-down version. In fact, System B has only 128 processors, not 16,000, so it runs only 7x times faster than a single processor.
`System A is over 100 times more complex than System B, and yet System
`A runs only twice as fast. The cost of hardware would have to be near zero for
`System B to fail to compete with System A. For the next decade at least, it
appears to be unjustifiable on a cost basis to double performance by replicating hardware one-hundred-fold.
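A quick check of these numbers under the assumed log₂ N speedup law, rounding the 16,000 processors to 2¹⁴ = 16,384 so the logarithm comes out evenly:

```python
import math

def speedup(n, x=1.0):
    """Speedup under the text's assumed law: x * log2(N) for N processors."""
    return x * math.log2(n)

big, small = 16_384, 128
print(speedup(big), speedup(small))              # 14.0 and 7.0 (in units of x)
print(f"hardware ratio: {big // small}:1, "      # 128 times the processors...
      f"speed ratio: {speedup(big) / speedup(small):.0f}:1")  # ...for 2x the speed
```

Over one hundred times the hardware buys only a doubling of speed, which is the crux of the argument.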
`The arguments in this section have taught us:
`
`• We can evaluate architectures by their cost and performance;
`• The effectiveness of an architecture must be measured on workloads for
`which the architecture is intended; and
• An architecture that is inefficient because of wasted resources will compete poorly against a simpler but more efficient architecture.
`
If computer architecture were purely an art, and aesthetics alone determined the quality of an architectural design, we would not have a basis for technical advances. Computer architecture combines the art of design with insight derived from careful analysis to create new forms of computer systems that yield ever greater service to their users.
`
`1.3 High-Performance Techniques
`Of the criteria discussed in the preceding section, this text emphasizes high
`performance. Our objective is to describe many different ways to improve
`system performance and give some additional information for evaluating
`those techniques. The menu of available techniques is rather extensive today,
`and each new generation of technology brings new ideas to the fore.
`
This text covers the highlights of the existing menu of design choices, but is by no means complete as of its publication date. Therefore we explore the design methodology: identify the critical design problems, generate solutions to these problems, evaluate them, and select the best or most reasonable solution.
`Although we emphasize performance, a thorough evaluation should
`consider all the criteria for comparing architectures. We simply place a
`greater weight on performance. For the majority of the design space, cost and
`performance are treated together as a single parameter, the cost-performance
ratio. The ratio is appropriate because it stays constant as you increase performance and cost by equal factors.
We would like to believe that users are willing to pay ten percent more for a machine that is ten percent faster, that is, a machine whose cost-performance ratio is equal to their current one. If a machine yields twenty percent higher performance for ten percent higher cost, the users may see a genuine benefit in moving to the new machine, and indeed it has a lower cost-performance ratio, reflecting a lower cost per computation. In most cases, users would not be interested in a machine that yields only five percent higher performance at ten percent higher cost because their cost per computation goes up, not down, if they move to the new machine.
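All three of these cases reduce to comparing cost divided by performance. A minimal sketch, with the user's current machine normalized to unit cost and unit performance (the helper name is ours):

```python
def cost_perf_ratio(cost, performance):
    """Cost per unit of performance; lower means cheaper computation."""
    return cost / performance

baseline = cost_perf_ratio(1.00, 1.00)        # 1.000
upgrade_good = cost_perf_ratio(1.10, 1.20)    # ~0.917: a genuine benefit
upgrade_bad = cost_perf_ratio(1.10, 1.05)     # ~1.048: cost per computation rises

print(baseline, round(upgrade_good, 3), round(upgrade_bad, 3))
```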
`The exceptional cases occur when the present facilities are saturated, and
`the user absolutely must have greater capacity. Now the cost-performance
`ratio does not tell the whole story because the total benefit of greater capacity
`for the user may be much greater than the cost to achieve that capacity. The
`fact that the user is actually paying a higher cost per computation to obtain
`that capacity is incidental to the value in being able to do computations that
`could not be done before. However, if the user has a choice in how to obtain
`the necessary capacity, the user may still pick a solution based on the lowest
`cost-performance ratio, even though all possible solutions have higher ratios
`than the ratio for the user's current system.
`
`1.3.1 Measuring Costs
`
We have been careful to give examples based on small changes in performance and cost. The cost-performance ratio is a good indicator of relative
`quality for small changes, but its usefulness breaks down when costs and
`performance vary by large factors.
`It would be very deceptive, for example, to measure the cost-performance
ratio of a small computer, such as an 8-bit video-game system, and to compare this to a much more powerful system, such as a workstation for
`computer-aided design. Although both systems are used to display images
`and interact with the images in real time, the video game probably has a
`much better cost-performance ratio than the workstation, assuming we can
`find some way of measuring relative performance. The problem is that the
`relative costs of the systems vary by a factor of up to 1000 to 1, and similarly,
`the relative performance factor is very large, although probably not as large
`as the relative cost.
`The video game cannot do the same job as the workstation. Moreover, if
`you put enough copies of the video game together to have a performance
`equal to the workstation, the cost would be less than the workstation cost, but
`the collection of video games still could not do the same job. So just to be sure
`that comparisons based on cost-performance ratios are valid, one should be
careful to make the comparisons between computers that are similar in function and relatively close in performance.
This discussion points to two important ways to make architectural advances:
1. Make small changes in cost and performance that yield lower cost-performance ratios; and
2. Boost absolute performance to make new computations feasible at reasonable cost.
`By "small" changes, we mean roughly a factor of 10 or less. Changes
`larger than this are surely welcome, but the cost-performance ratio cannot be
`trusted as a measure to evaluate the change. For the second point, the
`cost-performance ratio can actually increase, provided that the additional
`cost can be absorbed by the user, because the benefit of the greater capacity
exceeds the cost to attain the capacity. We use both of these criteria throughout the text as informal ways to evaluate ideas.
`Because absolute cost measured in physical currency is changing every
year, it is more useful to define cost in terms of other parameters that influence cost. These parameters include the physical parameters, such as pin
`count, chip area, chip count, board area, and power consumption, derived
`from an implementation of an architecture. The parameters also include
`factors associated with development, such as elapsed design time, amount of
`associated software to be written, and size of development team required.
`This text cannot easily account for all the factors that affect cost, but it
`can isolate the most important ones, especially when comparing two closely
`related architectures whose differences are limited to a few critical design
`choices. The intent is to focus on the differences and discuss the ways they
`affect the cost factors. Each different approach has its own advantages and
`disadvantages, and they in turn affect the cost of the approach. We cannot
`give absolute costs, but we can show the influence of the design decision on
`the cost parameters. The reader can then apply the prevailing cost functions
`to complete the evaluation.
`
`1.3.2 The Role of Applications
`
With dramatic changes in technology ahead, how do we approach the problem of high-performance architecture design? For example, the new technology makes feasible massive parallelism. How much additional effort
`should be invested in increasing the performance of a single processor before
`we seek higher levels of performance by replicating processors? There is no
`simple answer to these questions. We need a combination of solutions, and
`what we choose almost certainly will be application dependent.
`The role of applications is critical in the high-performance arena because
`costs tend to be very high to wring the greatest possible throughput from an
architecture. Inefficiency is especially costly in this context because inefficiency adds greatly to already high cost, while contributing less than its fair share to performance. If the application area is heavily biased to some well-identified workload, then it becomes possible to design the architecture for that type of workload. The result is that the architecture can be stripped clean of irrelevant functions that might otherwise be necessary for general purposes. It can then be heavily armed with functions pertinent to the particular workload.
`The objective then is to reduce inefficiency by making sure that all the
`functional components of the architecture contribute effectively to achieving
`high performance. If it were possible to build a general-purpose machine that
`would be equally effective for all high-performance applications, the industry
would do so. And we cannot rule out this possibility in years to come. However, for the next decade, specific problem areas are so demanding of computational cycles that it is fruitful to design architectures specialized for
`computational cycles that it is fruitful to design architectures specialized for
`these problem areas.
`Among the important problem areas that have evolved are:
`
• Highly structured numeric computations: weather modeling, fluid flows, finite-element analysis;
• Unstructured numeric computations: Monte Carlo simulations, sparse matrix problems;
• Real-time multifaceted problems: speech recognition, image processing, and computer vision;
• Large-memory and input/output-intensive problems: database systems, transaction systems;
• Graphics and design systems: computer-aided design; and
• Artificial intelligence: knowledge-base-oriented systems, inferencing systems.
`
Obviously, the numerical areas call for sophisticated floating-point processors in the architecture, and the more demanding applications may
`require hundreds of such processors. The graphics systems may be more
`strongly oriented to fixed-point computations to provide the mathematical
`support required for windowing and perspective viewing. Floating point,
`however, plays an important role in some graphics applications, such as
`those that require smooth-curve rendering and ray-tracing calculations. The
`artificial-intelligence systems may require very little arithmetic capability,
`but they are usually heavily endowed with memory.
`A high-performance architecture that meets the needs of all the areas
`mentioned must carry a burden of inefficiency for each problem area because
`a substantial portion of its capability would not be useful for individual
`applications. If the inefficiency is high enough for any one application area,
`then an efficient specialized machine for that area is more attractive than a
`general-purpose machine because the specialized machine should cost less to
`manufacture.
`The cost advantage depends on having a large enough market for the
`specialized machine so that the cost of development can be spread across
`many copies produced. The advantage is lost if only a few copies are sold.
`Consequently, even the specialized high-performance machines should be as
`general purpose as possible within their problem domains so that the fixed
`costs can be amortized over as large a base as