UNITED STATES PATENT AND TRADEMARK OFFICE
`______________________
`
`BEFORE THE PATENT TRIAL AND APPEAL BOARD
`______________________
`MICROSOFT CORPORATION,
`Petitioner,
`v.
`DIRECTSTREAM, LLC,
`Patent Owner.
`_______________________
`IPR2018-01594 (Patent 6,434,687 B1)
`IPR2018-01599 (Patent 6,076,152)
`IPR2018-01600 (Patent 6,247,110 B1)
`IPR2018-01601 (Patent 7,225,324 B2)
`IPR2018-01602 (Patent 7,225,324 B2)
`IPR2018-01603 (Patent 7,225,324 B2)
`IPR2018-01604 (Patent 7,421,524 B2)
`IPR2018-01605 (Patent 7,620,800 B2)
`IPR2018-01606 (Patent 7,620,800 B2)
`IPR2018-01607 (Patent 7,620,800 B2)
`__________________________
`
`DECLARATION OF JON HUPPENTHAL
`
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2101, p. 1
`
`

`

`TABLE OF CONTENTS
`
I. INTRODUCTION
II. QUALIFICATIONS
III. STATE OF THE ART
A. Cray Research and Cray Computer Corporation
B. SRC Computers
C. SRC-6 Hi-Bar Crossbar Switch
D. SRC-6 Processor
E. SRC-6 Common Memory
F. SRC-6 Reconfigurable Processor
G. MAP Development
H. SRC Architecture and Focus Change
I. Software Development
J. Applications
K. Summary
`
`I. INTRODUCTION
`1. I am an inventor of U.S. Patents 6,076,152, 6,247,110, 6,434,687, 7,225,324,
`
`7,421,524, and 7,620,800 and one of the original employees of SRC Computers.
`
`2. Everything in this declaration is based on my personal knowledge and
`
`professional judgment. Several of the documents referenced in Exhibit B and
`
`attached to this declaration are based on my personal knowledge from awareness of
`
`them at the time of their creation, documents I personally created, or business records
`
of SRC Computers/DirectStream, LLC, of which I am a custodian. Furthermore, I took all of the photographs in this document at the time, with the exception of the photograph of the SRC-6e, which came from the SRC Computers/DirectStream photo archive.
`
3. If called as a witness during this matter, I am prepared to testify competently about the matters stated in this declaration.
`
`II. QUALIFICATIONS
`4. My curriculum vitae is provided as Exhibit A. Relevant highlights are
`
`summarized below.
`
`5. I received a Bachelor’s Degree in Electrical Engineering from Purdue
`
`University in West Lafayette, Indiana in 1979, and am a named inventor on 27
`
`United States Patents, as well as numerous foreign counterparts. These patents cover
`
`methods and apparatus for wafer level testing of semiconductors, high-speed
`
`computer interconnect technologies, FPGA-based reconfigurable processor designs,
`
`heterogeneous computer system designs, optimal application programming
`
`techniques for reconfigurable processors, and methods for the use of heterogeneous
`
`computer systems.
`
`6. I also held a Top Secret SCI security clearance with SI TK endorsements. To
`
`achieve these clearances, I was subjected to extended background investigations by
`
`various U.S. Governmental Intelligence Services, which included polygraph testing.
`
`7. I am currently the Executive Vice-President and Chief Technology Officer for
`
`Systems at DirectStream, LLC. In this role I am responsible for, and actively
`
`participate in, the design and manufacture of all DirectStream FPGA-based computer
`
`systems.
`
`8. In 1996, I was asked by Seymour Cray, the father of supercomputing, to be one
`
`of the founders of SRC Computers LLC. I served as the Vice-President of Hardware
`
`Development for the company through December of 2003. In January 2004, I
`
`became the company’s Chief Executive Officer and Chief Technology Officer
`
`serving in that position until the company was acquired by DirectStream, LLC in
`
`February of 2016. While at SRC Computers I invented, developed and patented the
`
`FPGA-based MAP® processor, as well as the system architecture incorporating it and
`
`methods for its optimal use. I was also responsible for overseeing the entire
`
`intellectual property program at SRC.
`
9. Prior to SRC Computers, I was Manager of Electrical Design, initially for Cray Research in 1988 and then for Cray Computer Corporation after its separation in 1989. I
`
`stayed in this role until March of 1995 and was responsible for the electrical design
`
`and testing at the wafer, module and system level of the Cray-3, Cray-4 and Cray-5
`
`Gallium Arsenide-based supercomputers.
`
`10. I have been a member of the Advisory Boards for the School of Electrical and
`
`Computer Engineering of the University of Colorado, Colorado Technical University
`
`and the Catholic University of America. At the 2010 World Conference in Computer
`
`Science, I gave the keynote address and received the Outstanding Achievement
`
`Award in recognition of my “leadership and outstanding research to the field of
`
`Heterogeneous Systems”.
`
III. STATE OF THE ART
`A. Cray Research and Cray Computer Corporation
`11. In order to understand the DirectStream patents under discussion, it is
`
`imperative to understand the high-performance computing (HPC) field for which
`
they were developed. See, for example, patent number 6,076,152, col. 1, lines 35-49; patent number 7,421,524, col. 1, line 21 and col. 1, line 28 to col. 2, line 12; patent number 7,620,800, col. 1, lines 39-61; and patent number 6,434,687, col. 1, line 20 and col. 1, lines 52-63. Such an understanding unquestionably starts with Seymour Cray and the
`
`Cray family of supercomputer systems.
`
`12. My first involvement with Seymour and HPC came in November of 1988.
`
`Earlier that year, Seymour had moved the portion of Cray Research responsible for
`
`the Cray-3 system, along with himself, to Colorado Springs, Colorado. I had spent
`
`the previous 8 years in Colorado Springs designing test systems and simulators at
`
`TRW for use in the manufacture of cryptographic systems for the National Security
`
`Agency (NSA). As luck would have it, Cray Research had a serious issue trying to
`
`manufacture and test the semiconductor wafers used in the Cray-3 and I was
`
`recruited to solve this problem. My experiences here would heavily influence the
`
`design choices that would be made at SRC Computers.
`
`13. The Cray-3 was a very typical Seymour Cray architecture building on the
`
Cray-1 and Cray-2 with a relatively small number, 2 to 16, of very high-performance
`
`processors connected to multiple shared memory banks through a crossbar switch. A
`
`more detailed description of these systems can be found in the Cray Research and
`
`Cray Computer Corporation documentation.1,2,3
`
`14. Unlike all other computers at the time, the Cray-3 used Gallium Arsenide
`
`(GaAs) instead of silicon to make its semiconductors. The reason for this was that
`
`GaAs had significantly higher electron mobility than silicon so we were able to
`
operate these circuits much faster than silicon, which would yield a big performance
`
`advantage over all other systems.
`
15. The drawback was that at that time GaAs chips could not be fabricated with as small a feature size as silicon, so the number of gates that could fit in a given area of a semiconductor wafer was smaller. This meant that we could not fit as much circuitry on a GaAs chip as competitors using silicon, such as Intel, could.
`
`However, the performance gains were still significant enough that the decision was
`
`made to stay with GaAs and build the Cray-3 in such a way that we could package
`
`many bare GaAs ASIC die in a very dense fashion to minimize the size of the
`
`system.
`
`16. This meant that a single Cray-3 processor would be built not using a single or
`
`small number of microprocessor chips, but rather a set of four, 4"x4"x1/4" modules
`
`containing a total of 4096 GaAs ASICs in a very unique and complex 3D stacking
`
`process3.
`
`
`4"x4"x1/4" Cray-3 Module with Bare GaAs Logic Board and Memory Boards
`
`17. A complete 10 module processor assembly would consist of eight modules
`
making up two processors and two modules for I/O. These were then interconnected
`
`to the common memory banks using thousands of twisted pair wires making up a
`
`wire mat. The term Memory Bank was well understood in the HPC industry,
`
including by me, as a group of interconnected memory devices that are accessible by
`
`a processor or other similar device through a single access port and was commonly
`
`used in Cray Research documentation.1,2,3
`
`
`
`Final Generation Dual Processor Cray-3 Assembly with Wire Mat
`
`18. The total volume consumed by a single Cray-3 processor was about the same
`
`
`
`size as the much lower performance Intel slot 2 Xeon processor that would come
`
`along a decade later. This same stacked module assembly process detailed in the
`
`Cray-3 documentation3 was also used to create the Common Memory Banks and the
`
`Crossbar Switch in the Cray-3. The Crossbar Switch in the Cray-3 was actually
`
`distributed among all the modules and was not a discrete assembly.
`
`32 Module 32 Bank Cray-3 Memory Bank Assembly and Wire Mat
`
`19. In total, the two processors, accompanying memory banks and crossbar
`
`
`
`switch, called an Octant, were made up of 18,432 GaAs die level ASICs along with
`
`18,432 memory die all mounted directly on circuit boards. While many of the die had
`
`identical functionality, there were still 480 unique GaAs ASIC designs. Due to the
`
`complex module assembly, and since there was no packaging to be used on any of
`
`these parts, it was imperative that the good and bad die be identified while the GaAs
`
`wafers were still in wafer form.
`
`20. At the time I joined Cray Research in 1988, the company had no effective way
`
`to test these wafers at full operating speed. As a result, completed modules had to be
`
`repeatedly disassembled and reassembled to try to find working ASICs. With 40,000
`
`opportunities for a bad part to be in the mix, the Cray-3 was making slow forward
`
`progress. My assignment was to develop a way to functionally test the ASICs on the
`
`wafers at 500MHz. At the time there were no commercial systems available to do
`
`this. The first step was to develop our own probe cards just to make the high number
`
`of contacts with the wafer in a way that would support our fast data rates. This design
`
`would ultimately lead to my first patent.
`
`500 MHz Cray-3 Wafer Probe Card
`
`21. With that accomplished, we then had to find a way to generate all the high-
`
`speed input data patterns that we needed, as well as a way to capture the high-speed
`
`output signals. This would involve both very expensive commercial equipment and
`
`circuits of our own designs. Once the whole test system was functioning, we then had
`
`to develop the 480 test programs to run on it to test each ASIC type. After several
`
`years of effort, we were then able to successfully screen the good and bad ASICs at
`
`the wafer level.
`
`22. Now that we could accurately evaluate the quality of the GaAs wafers we
`
`uncovered the next major issue. At that time there were only two commercial
`
`foundries, Gigabit Logic and Fujitsu, that would produce GaAs wafers to order.
`
`Unfortunately, what we found was that the process consistency both lot-to-lot and
`
`between vendors was horrible. As a result, we were unable to have a constant supply
`
`of good ASICs that would work with each other. This led to the company making the
`
`decision to build our own GaAs semiconductor processing foundry so that we could
`
`be assured of a consistent supply of ASICs. Once this very expensive effort was
`
completed and operational, we were finally able to build functional Cray-3 systems.
`
`23. In 1989 the Colorado Springs operation and Seymour Cray broke off from
`
`Cray Research to become Cray Computer Corporation (CCC) and I became Manager
`
`of Electrical Design for the new company. In this role I was responsible for all
`
`electrical design aspects of the system and Seymour would lead a team of about 8
`
`engineers who would design the logic that would go into the ASICs. This structure
`
`would continue our very close working relationship.
`
`24. Unfortunately, overcoming these major issues associated with the ASICs and
`
`the complex module assembly process had made the Cray-3 very late to market and
`
`the underlying ASIC performance was not as good as what we knew we could now
`
`build. Consequently, after fielding just one machine at the National Center for
`
`Atmospheric Research, most of us would focus on the Cray-4. This new system
`
`would leverage all that we had learned on the Cray-3 and would be assembled in a
`
`similar fashion but with 2X faster GaAs ASICs that each contained 10x more
`
`circuitry. Of course, these faster ASICs would require new faster test systems and a
`
`new test program for every ASIC.
`
`25. At that time there was no commercial equipment that we could leverage so the
`
`entire wafer level test system was designed and built internally. One day, while
`
`debugging the first of these test systems, a circuit board caught fire. It was one of
`
`about 100 that fed the inputs to the ASICs so the impact was minimal. However,
`
`within 30 minutes the stock analysts had somehow found out and were already
`
`calling the front office asking what the impact would be to the Cray-4 program. This
`
`just highlighted to all of us the criticality of our testing efforts to the overall
`
`company.
`
`26. In the end, the HPC markets were now in enough of a state of confusion that
`
`the Cray-4 was not enough to keep the company going as discussed more thoroughly
`
by Seymour himself in the Newcray Business Plan:4
`
`
`
`
`
`27. The day we closed the company and informed the employees in March of
`
`1995, Seymour said to me "If we could just mothball what we have for five years the
`
`government would be clamoring for it". This statement impacted me because it
`
`confirmed we were on the right track long-term for building HPC systems, but
`
needed the customers to exhaust other inefficient systems first. In the end he was right: after just a few years, the NSA started complaining that no one was
`
`producing vector processors and they started pumping money into Cray Research to
`
`keep them going in that direction. Unfortunately, Seymour would not live to see this
`
`happen.
`
`28. After helping liquidate Cray Computer, I would spend the next year as
`
`Manager of Portable Product Engineering for Apple, where my exposure to
`
microprocessor designs, like all of my years in HPC at Cray, would have significant
`
`influence on all the decisions that we would make at SRC Computers.
`
`B. SRC Computers
`29. In June of 1996, Seymour called and wanted to meet for lunch. At that
`
`meeting he presented me an offer package4 to join him at a new company he was
`
`starting. The idea was to go after the same HPC markets and customers that we had
`
`at Cray Computer Corporation but do it in a more cost-effective way. One of the key
`
`elements of this plan was to utilize Intel microprocessors but to do so in a very novel
`
`way that no other company had the understanding or expertise to do. It is particularly
`
`noted in the business plan that HP did not have the expertise to do what we were
`
`going to do and that as part of the plan we would share our technical capability with
`
them.4 The real differentiator of the systems built by Newcray, ultimately called SRC Computers, would be that we would implement a classic Cray architecture having microprocessors connected through a crossbar switch to common memory banks.
`
`
`
`
`
`July 1996 White Board in Jon Huppenthal's Office Showing Initial
`SRC-6 Block Diagram
`
`
`
30. Accomplishing this meant that we had to design a high bandwidth crossbar
`
`switch to connect multiple Intel microprocessors to multiple memory banks that we
`
`would also have to design. The typical way of accomplishing such custom logic
`
`functions as practiced by our competitors was to develop ASICs. However, since
`
`these would require the same performance level as the microprocessors, ASICs
`
`would have to be fabricated using the same leading-edge semiconductor fabrication
`
`process as the microprocessors. My experience at Cray told me that to get the
`
`performance level that we wanted, particularly in the relatively low volumes that we
`
`would consume, would probably cost us several hundred million dollars. In addition,
`
`the rate at which we made feature additions and improvements at Cray would mean
`
`that these ASICs would probably have to be updated relatively often thus incurring
`
`additional cost. Not wanting to repeat my earlier ASIC experience, I suggested that
`
`we could find a way to accomplish what we needed using commodity Field
`
`Programmable Gate Arrays (FPGAs). These devices consisted of an array of
`
`identical circuit blocks that the user could program to perform whatever function they
`
`desired. Consequently, we could accomplish the custom designs that we required
`
`without the need to design, fabricate and test ASICs.
`
`31. In the summer of 1996, the highest clock rate FPGAs were built by Lucent.
`
`While Xilinx and Altera produced FPGAs that could hold somewhat more circuitry,
`
`the rate at which we could run them was significantly lower than the Lucent parts. As
`
`a result, we decided to repeat the path that we had followed at Cray and go with the
`
smaller but faster parts. This was again a very counterintuitive choice since most
`
`HPC designers at the time were trying to fit as much functionality as possible into a
`
`single chip to simplify the inter-chip communication and overall system design.
`
`Since most of the designers at SRC came from CCC, they were already very
`
`experienced at partitioning their designs to efficiently use multiple small chips.
`
`C. SRC-6 Hi-Bar Crossbar Switch
`32. As we detailed out the design in the late summer of 1996, it became apparent
`
`that there would be three major FPGA design efforts. The first would be a Bridge
`
`Chip designed to interface the commodity microprocessor to the Cross Bar Switch.
`
`The second would be the Cross Bar Switch itself and the third would be the Common
`
`Memory Banks. We decided that I would design the Cross Bar Switch first. Given
`
`the number of I/O pins on the largest FPGA package, it was decided that the switch
`
`would be made in two tiers such that the first tier would connect to a group of
`
`microprocessors and the second physically separate tier to 16 memory banks. This
`
`would allow the output of one tier to be connected to the inputs of the second such
`
`that all processors could access up to 256 memory banks in a fully populated 16
`
`segment system. The switch tiers were duplicated for both the read and write paths to
`
`memory. For a variety of reasons, we vacillated on the number of microprocessors
`
`that made up a group as being between 16 and 20. Ultimately, we built the first
`
`switches assuming 20 but only populated connectors for 16.
`
`Slide from February 1997 Showing Half of the SRC-6 Interconnect6
`
`33. To accomplish this design, each tier was built as two large high layer count
`
`
`
`circuit cards each containing 27 interconnected FPGAs. Two cards dealt with data
`
`traffic going to the memory banks and two with traffic coming from the memory
`
`banks. We would use 16 of each of the four switch board designs in a fully
`
`configured 16 segment system consuming 1728 FPGAs just for the switch. As we got
`
`into the details of the FPGA designs for the various switch chips we discovered that
`
`the Lucent design tools of the day could not adequately control the time delay of the
`
`quantity of signals inside an FPGA that we needed. To achieve the high performance
`
`that we required, I ended up hand selecting all the routing resources in the FPGA so
`
as to ensure equal performance of all signal paths through the switch FPGAs.
`
`Basically, I hand routed all the switch FPGAs. To physically fit all of these FPGAs
`
`and I/O connectors on a single circuit card required a very large, high layer count
`
`printed circuit board.
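The switch sizing given in paragraphs 32 and 33 can be checked with a short calculation. The sketch below is illustrative only: the constant and function names are assumptions introduced here, while the figures (16 banks per segment, 16 segments, four switch board designs, 16 boards of each design, 27 FPGAs per board) come from the text above.

```python
# Illustrative arithmetic only. Names are hypothetical; figures are from the declaration.
BANKS_PER_SEGMENT = 16      # the second switch tier connects to 16 memory banks
SEGMENTS = 16               # fully populated system
SWITCH_BOARD_DESIGNS = 4    # two tiers, each with a to-memory and a from-memory card
BOARDS_PER_DESIGN = 16      # 16 of each board design in a full system
FPGAS_PER_BOARD = 27        # interconnected FPGAs on each switch card


def total_memory_banks(segments: int = SEGMENTS,
                       banks_per_segment: int = BANKS_PER_SEGMENT) -> int:
    """Memory banks reachable by every processor in a fully populated system."""
    return segments * banks_per_segment


def total_switch_fpgas(designs: int = SWITCH_BOARD_DESIGNS,
                       boards_per_design: int = BOARDS_PER_DESIGN,
                       fpgas_per_board: int = FPGAS_PER_BOARD) -> int:
    """FPGAs consumed by the crossbar switch alone."""
    return designs * boards_per_design * fpgas_per_board


if __name__ == "__main__":
    print(total_memory_banks())   # 256
    print(total_switch_fpgas())   # 1728
```

Both results match the 256 memory banks and 1728 switch FPGAs stated in the text.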
`
`
`
`One of Four Original SRC-6 17" x 22" Cross Bar Switch Boards
`
`34. However, due to the very high quantity of impedance controlled high speed
`
`circuit board traces that were required to interconnect all of the FPGAs, it proved not
`
`physically possible to manufacture the circuit boards using standard PCB processing
`
`techniques of the day. This was because of the basic physics involved. To achieve the
`
`best desired interconnect signal quality, the board must maintain a specific
`
`impedance along its traces. This impedance is determined by the width and thickness
`
`of the trace, its distance to the nearest reference plane, and the type of material used
`
`to separate the two. This will then tell you how thick a single layer of the board must
`
`be. As you interconnect all the chips during the layout of the PCB you find out how
`
`many layers will be required to completely interconnect all chips without any traces
`
`crossing each other on the same layer. Now to interconnect the surface pads under
`
`each FPGA to the signals on the inner layers of the board, you must drill a hole
`
`through the board and then plate the hole. This is called a via. The state of PCB
`
`manufacturing at any point in time will tell you how deep you can drill and plate for a
`
`given diameter hole which is called the aspect ratio. The larger the hole diameter, the
`
`thicker the board you can drill through. On top of that, the pad pattern of the FPGA
`
`package will determine what the largest diameter hole is that you can use without
`
`shorting two pads together. Because of these aspect ratio issues, in the time frame
`
that we were building these boards, no traditional board shop was capable of
`
`manufacturing the roughly 50 layer thick and very large boards that we required. This
`
`problem threatened to completely derail the program since we could not reduce the
`
`FPGA count and be able to achieve the system performance that we needed. After a
`
`global search we came across a circuit board technology called Multiwire. This
`
`process was only available from one shop in Japan and one in Georgia. What it did
`
`was to embed very small insulated wires into the circuit board resin instead of
`
`etching away copper like traditional circuit boards. Since the wires were insulated
`
`they could cross over each other in a single layer unlike regular board traces. This
`
resulted in about a 4x to 6x reduction in the number of layers required, which
`
`reduced the aspect ratio of the via by about the same amount. Consequently, they
`
`were able to produce our switch boards as designed using this technology.
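The layer-count argument in paragraph 34 can be illustrated numerically. This is a rough sketch under stated assumptions: aspect ratio is taken as board thickness divided by drilled hole diameter (the usual PCB definition), every layer is assumed to have the same thickness, and the per-layer thickness and hole diameter below are made-up placeholder values, not figures from the SRC-6 boards. Only the roughly 50 layers and the 4x to 6x reduction come from the text.

```python
# Illustrative sketch. LAYER_THICKNESS_MM and HOLE_DIAMETER_MM are hypothetical
# placeholders; only ORIGINAL_LAYERS and the 4x-6x factors come from the declaration.
ORIGINAL_LAYERS = 50          # "roughly 50 layer" conventional board
LAYER_THICKNESS_MM = 0.15     # hypothetical thickness of one layer
HOLE_DIAMETER_MM = 0.30       # hypothetical via hole diameter set by the FPGA pad pattern


def layers_after_reduction(layers: float, reduction_factor: float) -> float:
    """Layer count once crossing wires cut the number of layers needed."""
    return layers / reduction_factor


def via_aspect_ratio(board_thickness_mm: float, hole_diameter_mm: float) -> float:
    """Board thickness divided by hole diameter."""
    return board_thickness_mm / hole_diameter_mm


def aspect_ratio_for_layers(layers: float) -> float:
    """Aspect ratio of a through-hole via for a board with the given layer count."""
    return via_aspect_ratio(layers * LAYER_THICKNESS_MM, HOLE_DIAMETER_MM)


if __name__ == "__main__":
    print("conventional:", aspect_ratio_for_layers(ORIGINAL_LAYERS))
    for factor in (4, 6):
        layers = layers_after_reduction(ORIGINAL_LAYERS, factor)
        print(f"{factor}x fewer layers:", layers, aspect_ratio_for_layers(layers))
```

With a fixed hole diameter, the aspect ratio scales directly with the layer count, which is why a 4x to 6x reduction in layers reduces the via aspect ratio by about the same amount.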
`
`
`
`
`17"x22" Multiwire SRC-6 Switch Board Layer Showing Close Up of Wires
`
`35. Over the years as FPGA process geometries improved and internal feature
`
sizes shrank, we were able to put more of the switch logic in each FPGA. This
`
`allowed us to reduce the number of FPGAs required to build the switch, thus
`
`reducing the amount of chip to chip interconnect. This reduction in interconnect
`
`reduced the PCB layer count and ultimately allowed us to stop using Multiwire
`
`boards and move back to traditional printed circuit boards. In March of 2004, we
`
`filed for trademark protection of the name Hi-Bar for our switch, which was issued
`
`in August of 2005 and is still used today.
`
`36. In a multi-processor HPC system with common memory banks such as we
`
`were building, it is imperative that all processors have equal access to all memory4.
`
`This is referred to as a Symmetric Multi-Processor computer system or SMP system.
`
`To accomplish this symmetric access, all portions of the switch must be in
`
`communication with each other. This allows memory accesses to be equitably
`
`granted to prevent any one processor from blocking access to a memory bank by
`
other processors. A switch arbitration scheme that could be implemented in an FPGA and coordinate the routing activities of the up to 1728 FPGAs that made up our switch did not exist. As a result, we had to design one, which resulted in the first issued SRC Computers patent, number 6,026,459.
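To make the fairness requirement concrete, the sketch below shows round-robin arbitration, a common textbook policy for granting access to a shared resource so that no requester can starve the others. It is not the scheme of patent number 6,026,459, which this declaration does not describe, and it models a single memory bank in software rather than a distributed FPGA switch. The class and method names are hypothetical.

```python
from typing import Iterable, Optional


class RoundRobinArbiter:
    """Grants one requester per cycle, rotating priority after every grant.

    Generic illustration only; not the arbitration scheme of any SRC patent.
    """

    def __init__(self, num_requesters: int) -> None:
        if num_requesters <= 0:
            raise ValueError("num_requesters must be positive")
        self.num_requesters = num_requesters
        self._next = 0  # requester with the highest priority this cycle

    def grant(self, requests: Iterable[int]) -> Optional[int]:
        """Return the requester granted this cycle, or None if nobody asked."""
        pending = set(requests)
        for offset in range(self.num_requesters):
            candidate = (self._next + offset) % self.num_requesters
            if candidate in pending:
                # The winner drops to lowest priority for the next cycle.
                self._next = (candidate + 1) % self.num_requesters
                return candidate
        return None


if __name__ == "__main__":
    arbiter = RoundRobinArbiter(4)
    # Processors 0 and 1 both request the same bank on every cycle.
    print([arbiter.grant({0, 1}) for _ in range(4)])  # [0, 1, 0, 1]
```

Even though processor 0 requests on every cycle, it cannot block processor 1: the grants alternate between the two.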
`
`37. Unfortunately, before we ever got to this point in the design process, Seymour
`
`would pass away as the result of a car accident. The HPC community truly lost a
`
`visionary (http://pages.cs.wisc.edu/~bezenek/cray.html, last accessed July 11, 2019).
`
`From that point on I was the final decision maker for all technical aspects of the
`
`system.
`
`D. SRC-6 Processor
`38. The next portion of the system to be dealt with was the microprocessor board.
`
`As discussed in the Newcray business plan4, we intended to use Intel's upcoming
`
`processor code named Merced. This processor had been initially designed by Hewlett
`
`Packard and was to be fabricated by Intel. It was also going to be Intel's first offering
`
`with a 64-bit address bus and 64-bit registers. We felt that this would finally be
`
`adequate to address the large common memory that HPC applications required, as
`
`well as having high enough performance for our HPC customers. In the summer of
`
`1996, as we started having meetings with Intel, significant issues with Merced started
`
`to come to light. When the engineers at HP initially designed the processor, their
`
`primary focus was to include all of the high-end processor features that they felt they
`
`would need, which was also well aligned with what we wanted for HPC.
`
`Unfortunately, this design did not appropriately take into account Intel's design rules.
`
`These rules were very important and are what allowed Intel to achieve the very high
`
`manufacturing yields that they were known for. The end result was that Merced,
`
`which would carry the product name of Itanium, was not going to be available in
`
`1997 as we had expected. Redesign cycles and production issues ended up ultimately
`
`causing its release to be delayed until June of 2001. It was clear to us by late summer
`
`of 1996 that an alternative to this processor had to be found.
`
39. When it came to HPC architectures there were two basic models: the SMP model that we were pursuing and the MPP, or Massively Parallel Processing, model.
`
`The basic difference between the two is that MPP tries to use hundreds or thousands
`
of low performance processors all working together, whereas SMP uses a small
`
`number of high performance processors. Once, when Seymour Cray was asked about
`
`his thoughts on MPP he replied, "If you were plowing a field, which would you
`
`rather use? Two strong oxen or 1024 chickens?" The big problem with the MPP
`
`strategy at that point in time was that it was very difficult to program and coordinate
`
`a large number of processors to accomplish a single task. Seymour himself talks
`
`about this in the Newcray Business Plan4. Even today, very few computer
`
`applications can even take advantage of the multiple microprocessors found in a
`
`standard microprocessor packaged device. Since we were developing an SMP
`
`system, our choices for microprocessors for use in the SRC-6 were limited to the
`
`highest performance full featured microprocessors of the day, which primarily came
`
`from Intel with whom we already had a relationship.
`
`40. To carry Seymour's ox and chicken quote a bit further I would add that
`
`choosing one over the other also has additional impact. While both the ox and
`
`chicken are generally categorized as livestock, there are distinct differences between
`
`the two. At night, when it is time to put them in the barn, the chicken farmer can
`
simply pick up a chicken and put it in the barn. However, since an ox weighs 1000x more than a chicken, the chicken farmer's method of accomplishing the same task is not relevant prior art, even though the end result of putting livestock in the barn is
`
`the same for both. Prior art methods are only relevant, and would be obvious to
`
`explore, if they apply to the features of the technology employed. In our case, the
`
`design of an HPC SMP system required the use of high end Intel microprocessors,
`
`which themselves had many unchangeable features much like the weight of an ox.
`
Therefore, solutions that were developed for other low end processors, without the restrictions that the required high end processors had, often became irrelevant. This
`
`meant that we had to make high end Intel processors work for us and it did not make
`
`any sense to spend much time exploring what low end processor designers were up
`
`to.
`
`41. In those days Intel was made up of two camps. There was the 64-bit group that
`
`was developing Merced with all kinds of new features and consequently problems,
`
`and the 32-bit group which had developed all of Intel's previously successful
`
`products and was on a more evolutionary development path. We immediately started
`
fresh discussions with the 32-bit group about what other processors were in development and nearing release. Their new high-end offering was code named Deschutes. While they were not permitted by Intel to use 64 address bits, they were using 36. This
`
would allow us to offer 64 Gbytes of shared common memory, which was at least 4 times greater than the largest Cray-3. Best of all, we could get samples in 1997 and production parts in 1998.5 Armed with that information, in January 1997 we made the decision to go with this processor instead of Merced.
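The 64 Gbyte figure follows directly from the 36 address bits. A minimal sketch, assuming byte-addressable memory and binary gigabytes (2^30 bytes); the function name is hypothetical:

```python
GIB = 2 ** 30  # bytes in one binary gigabyte


def addressable_bytes(address_bits: int) -> int:
    """Bytes reachable with the given number of address bits, one byte per address."""
    return 2 ** address_bits


if __name__ == "__main__":
    print(addressable_bytes(36) // GIB)  # 64
    print(addressable_bytes(32) // GIB)  # 4
```

For comparison, a plain 32-bit address reaches only 4 Gbytes, so the four extra address bits give a 16-fold increase.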
`
`
`
`42. With the processor nailed down it was time to work out the details of the
`
`processor board design. All of our previous designs using custom processors allowed
`
`the processor address and data bus to connect directly to the switch circuitry and on
`
to memory. This meant that a complete memory access on the Cray-3 was completed in 22 ns, or 5 1/2 processor clocks.3 Given the Deschutes' 100 MHz processor bus speed, and assuming the same number of clocks for a memory access, it should take about 55 ns to access memory. Unfortunately, to both Seymour's and my surprise, Intel informed us that there were on the order of 10-20 clock cycles required
`
`for the bus protocol alone so random accesses to memory would be much slower than
`
`we expected. This was probably the first indication that trying to adapt a commodity
`
`microprocessor to the HPC market was not going to be straightforward.
`
`Unfortunately, we did not have much choice but to move forward and use our
`
`technical skills to find ways around these problems.
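The timing figures in paragraph 42 can be reproduced with simple clock arithmetic. The sketch below uses hypothetical function names; the inputs (22 ns over 5 1/2 clocks on the Cray-3, a 100 MHz Deschutes bus, and 10 to 20 clocks of bus protocol) come from the text. It computes the protocol cost on its own, since the text does not say how that cost combines with the rest of an access.

```python
def clock_period_ns(frequency_mhz: float) -> float:
    """Length of one clock cycle in nanoseconds."""
    return 1000.0 / frequency_mhz


def time_for_clocks_ns(clocks: float, frequency_mhz: float) -> float:
    """Time taken by a given number of clock cycles at a given frequency."""
    return clocks * clock_period_ns(frequency_mhz)


CRAY3_ACCESS_NS = 22.0
CRAY3_ACCESS_CLOCKS = 5.5
DESCHUTES_BUS_MHZ = 100.0

if __name__ == "__main__":
    # Cray-3 clock period implied by 22 ns over 5 1/2 clocks.
    print(CRAY3_ACCESS_NS / CRAY3_ACCESS_CLOCKS)                        # 4.0 ns
    # Expected access time with the same clock count on a 100 MHz bus.
    print(time_for_clocks_ns(CRAY3_ACCESS_CLOCKS, DESCHUTES_BUS_MHZ))   # 55.0 ns
    # Time consumed by the bus protocol alone, per Intel's 10-20 clock figure.
    for clocks in (10, 20):
        print(clocks, time_for_clocks_ns(clocks, DESCHUTES_BUS_MHZ))    # 100.0 and 200.0 ns
```

On these numbers the bus protocol alone costs roughly two to four times the 55 ns that had been expected for the whole access.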
`
`43. The Deschutes slot 2 Xeon Pentium III processor was the highest performing
`
`processor in the 32 bit
