Case 3:14-cv-00757-REP-DJN Document 1-3 Filed 11/04/14 Page 1 of 33 PageID# 51

EXHIBIT A

US005860158A
United States Patent [19]
Pai et al.

[11] Patent Number: 5,860,158
[45] Date of Patent: Jan. 12, 1999
`
[54] CACHE CONTROL UNIT WITH A CACHE REQUEST TRANSACTION-ORIENTED PROTOCOL

[75] Inventors: Yet-Ping Pai, Milpitas; Le T. Nguyen, Monte Sereno, both of Calif.

[73] Assignee: Samsung Electronics Company, Ltd., Seoul, Rep. of Korea

[21] Appl. No.: 751,149
`
[22] Filed: Nov. 15, 1996

[51] Int. Cl.6 .......................... G06F 13/00
[52] U.S. Cl. .......................... 711/118; 711/130
[58] Field of Search .................. 711/119, 120, 130, 140, 118
`
[56] References Cited

U.S. PATENT DOCUMENTS

4,695,943   9/1987  Keeley et al. ........ 711/140
4,701,844  10/1987  Thompson et al. ...... 711/119
4,707,784  11/1987  Ryan et al. .......... 711/140
4,899,275   2/1990  Sachs et al. ......... 711/3
5,377,345  12/1994  Chang et al. ......... 395/425
5,418,973   5/1995  Ellis et al. ......... 395/800
5,524,265   6/1996  Balmer et al. ........ 711/212
5,574,849  11/1996  Sonnier et al. ....... 395/182.1
5,659,782   8/1997  Senter et al. ........ 395/800.23
`
Primary Examiner—Tod R. Swann
Assistant Examiner—Felix B. Lee
Attorney, Agent, or Firm—Skjerven, Morrill, MacPherson, Franklin & Friel, L.L.P.; Stephen A. Terrile

[57] ABSTRACT

A cache control unit and a method of controlling a cache. The cache is coupled to a cache accessing device. A first cache request is received from the device. A first request identification information is assigned to the first cache request and provided to the requesting device. The first cache request may begin to be processed. A second cache request is received from the cache accessing device. A second request identification information is assigned to the second cache request and provided to the requesting device. The first and second cache requests are finally fully serviced.

37 Claims, 8 Drawing Sheets
`
[Front-page figure: ARM_CCU Interface State Machine (A_SM)]
[Sheet 1 of 8 — FIG. 1: Block diagram of multimedia signal processor 100. Processing core 102 contains general purpose processor 110 and vector processor 120. Cache system 130 contains ICACHE 142 and DCACHE 144 (block 140), ICACHE 172 and DCACHE 174 (block 170), ROM 150, and cache control unit 160. IOBUS devices: system timer 182, UART 184, bitstream processor 186, interrupt controller 188. FBUS device: local bus interface 196.]
`
[Sheet 2 of 8 — FIG. 2: Block diagram of cache system 130. SRAM 210 with tag section 212 (write port 214, read port 216) holds GPP ICACHE 142A/TAG 142B, GPP DCACHE 144A/TAG 144B, VECTOR ICACHE 172A/TAG 172B, and VECTOR DCACHE 174A/TAG 174B; ROM 150 holds ROM CACHE 150A/TAG 150B. Cache control unit 160, data pipeline 220, and address pipeline 230 (ports 232, 234, 236) connect the vector (instruction/data) and GPP ports to FBUS and IOBUS.]
`
[Sheet 3 of 8 — FIG. 3: Block diagram of data pipeline 220. GP READ MUX 360, CACHE READ MUX 340, CACHE WRITE MUX 350, IOMUX 320, FBUS MUX 310, and BUFFER 330 connect general purpose processor 110, vector processor 120, IOBUS 180, FBUS 190, SRAM 210, and ROM 150.]
`
[Sheet 4 of 8 — FIG. 4]
`
[Sheet 5 of 8 — FIG. 5: Block diagram of address pipeline 230. Latches MEM_ADR_LAT, ADR_Q0_WB_LAT, and WB_LAT_TMP; tags UD_TAG 522, RD_TAG 506-2, and WR_TAG 506-1; address queues RD_ADR_Q and WR_ADR_Q (entries ADR_Q1–ADR_Q3); mux latches WR_ADR_MUX_LAT and RD_ADR_MUX_LAT 520; comparators 510, 511, and 521; signals RETURN_ADR, RETURN_ID, TAG_OUT, RAM_WRITE_ADR 550, and MCP_BASE; CACHE_ROM 150 and CACHE_RAM 210; interfaces to GPP 110, vector processor 120, IOBUS 180, and FBUS 190.]
`
[Sheet 6 of 8 — FIG. 6: ARM_CCU interface state machine (A_SM). FIG. 7: CCU_FBUS interface state machine (F_SM).]
`
[Sheet 7 of 8 — FIG. 8: Data receiver state machine (D_SM).]
`
`
[Sheet 8 of 8 — FIG. 9: Read state machine (RD_SM) and write state machine (WR_SM).]
`
`
CACHE CONTROL UNIT WITH A CACHE REQUEST TRANSACTION-ORIENTED PROTOCOL

COPYRIGHT NOTICE

A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
`
BACKGROUND OF THE INVENTION

1. Field of the Invention

This invention relates to providing processors with fast memory access and, more particularly, to providing control of cache memory systems.

2. Description of the Related Art

Processors often employ memories which are relatively slow when compared to the clock speeds of the processors. To speed up memory access for such processors, a relatively small amount of fast memory can be used in a data cache. A cache can mediate memory accesses and lessen the average memory access time for all or a large portion of the address space of a processor even though the cache is small relative to the address space. Caches do not occupy a specific portion of the address space of the processor but instead include tag information which identifies addresses for information in lines of the cache.

Typically, a cache compares an address received from a processor to tag information stored in the cache to determine whether the cache contains a valid entry for the memory address being accessed. If such a cache entry exists (i.e. if there is a cache hit), the processor accesses (reads from or writes to) the faster cache memory instead of the slower memory. In addition to tag information, a cache entry typically contains a "validity" bit and a "dirty" bit which respectively indicate whether the associated information in the entry is valid and whether the associated information contains changes to be written back to the slower memory. If there is no cache entry for the address being accessed (i.e. there is a cache miss), access to the slower memory is required for the cache to create a new entry for the just accessed memory address.
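The tag comparison described above can be sketched as follows. This is a minimal illustrative model, not the patent's hardware; the class and method names are invented for illustration:

```python
class CacheLine:
    """One cache line: tag plus validity and dirty bits, as described above."""
    def __init__(self):
        self.tag = None
        self.valid = False
        self.dirty = False
        self.data = None

class SimpleCache:
    """Direct-mapped sketch: an index selects a line, a tag match decides hit or miss."""
    def __init__(self, num_lines=4, line_size=32):
        self.line_size = line_size
        self.num_lines = num_lines
        self.lines = [CacheLine() for _ in range(num_lines)]

    def lookup(self, address):
        index = (address // self.line_size) % self.num_lines
        tag = address // (self.line_size * self.num_lines)
        line = self.lines[index]
        if line.valid and line.tag == tag:
            return "hit", line
        return "miss", line

    def fill(self, address, data):
        """On a miss, data fetched from the slower memory creates a new entry."""
        status, line = self.lookup(address)
        if status == "miss":
            line.tag = address // (self.line_size * self.num_lines)
            line.valid = True
            line.dirty = False
            line.data = data
        return line
```

A write to a valid line would set its dirty bit, marking the line for eventual write-back to the slower memory.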
Caches use cache policies such as "least recently used" or "not last used" replacement techniques to determine which existing entries are replaced with new entries. Typically, computer programs access the same memory addresses repeatedly. Therefore, the most recently accessed data is likely to be accessed again soon after the initial access. Because recently accessed data is available in the cache for subsequent accesses, caches can improve access time across the address space of the processor.
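The "least recently used" replacement technique mentioned above can be sketched as follows (an illustrative sketch only; the patent does not specify this implementation):

```python
from collections import OrderedDict

class LRUCache:
    """Tiny LRU replacement sketch: recently used entries survive, the oldest is evicted."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # address -> data, least recently used first

    def access(self, address, fetch):
        if address in self.entries:
            self.entries.move_to_end(address)  # hit: mark as most recently used
            return self.entries[address]
        data = fetch(address)  # miss: go to the slower memory
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry
        self.entries[address] = data
        return data
```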
A different method for increasing processor speed is the use of parallel processing techniques. For example, by providing a number of functional units which perform different tasks, a "very long instruction word" (VLIW) processor can perform multiple functions through a single instruction. Also, a general purpose processor and a vector processor may be integrated to operate in parallel. An integrated multiprocessor is able to achieve high performance with low cost since the two processors perform only tasks ideally suited for each processor. For example, the general purpose processor runs a real time operating system and performs overall system management while the vector processor is used to perform parallel calculations using data structures called "vectors". (A vector is a collection of data elements typically of the same type.) Multiprocessor configurations are especially advantageous for operations involving digital signal processing such as coding and decoding video, audio, and communications data.
`
SUMMARY OF THE INVENTION

It has been discovered that accesses to a cache by multiple devices may be managed by a cache control unit that includes transaction identification logic to identify cache accesses. Such an apparatus provides the advantage of improving performance by increasing the speed of memory accesses by one or more devices. Specifically, such an apparatus allows the cache to service later arriving requests before earlier arriving requests.

In one embodiment of the present invention, a cache is coupled to a cache accessing device. A first cache request is received from the device. A first request identification information is assigned to the first cache request and provided to the requesting device. The first cache request may begin to be processed. A second cache request is received from the cache accessing device. A second request identification information is assigned to the second cache request and provided to the requesting device. The first and second cache requests are finally fully serviced.
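The request flow described in this embodiment can be sketched as follows. The class and method names are invented for illustration and are not taken from the patent:

```python
import itertools

class CacheControlUnit:
    """Sketch of a control unit that tags each cache request with an
    identification code and may complete requests out of arrival order."""
    def __init__(self):
        self._ids = itertools.count(1)
        self.pending = {}  # request_id -> requested address

    def submit(self, address):
        """Accept a cache request and immediately return its request ID,
        before the request has been serviced."""
        request_id = next(self._ids)
        self.pending[request_id] = address
        return request_id

    def complete(self, request_id, data):
        """Fully service a pending request, in any order."""
        self.pending.pop(request_id)
        return (request_id, data)
```

Because the requester holds an ID for each outstanding request, a later-arriving request can be completed first while an earlier one is still being serviced.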
In another embodiment, a cache system includes a cache for temporarily storing information and a cache control unit. The cache control unit includes access control logic, identification logic, and result logic. The access control logic receives and executes cache accesses by a cache accessing device. The identification logic assigns request identification information to each of the cache accesses, and provides the request identification information to the cache accessing device. The identification logic is capable of providing the request identification information prior to the execution of the cache accesses by the access control logic. The result logic provides the request identification information and the data requested by the cache accessing device to the cache accessing device if the cache access was a read.
`
BRIEF DESCRIPTION OF THE DRAWINGS

The present invention may be better understood, and its numerous objects, features, and advantages made apparent to those skilled in the art by referencing the accompanying drawings.

FIG. 1 shows a block diagram of a multimedia signal processor in accordance with an embodiment of the invention.

FIG. 2 shows a block diagram of a cache system in accordance with an embodiment of the invention.

FIG. 3 shows a block diagram of a data pipeline used in a cache system in accordance with an embodiment of the invention.

FIG. 4 shows a block diagram of a data pipeline used in a cache system in accordance with an embodiment of the invention.

FIG. 5 shows a block diagram of an address pipeline used in a cache system in accordance with an embodiment of the invention.

FIG. 6 shows a state diagram of a cache control unit and processor interface in accordance with an embodiment of the invention.

FIG. 7 shows a state diagram of a cache control unit and bus interface in accordance with an embodiment of the invention.
`
`
FIG. 8 shows a state diagram of a data receiver state machine in accordance with an embodiment of the invention.

FIG. 9 shows a state diagram of a read/write state machine in accordance with an embodiment of the invention.

The use of the same reference symbols in different drawings indicates similar or identical items.
`
DESCRIPTION OF THE PREFERRED EMBODIMENT(S)

The following sets forth a detailed description of the preferred embodiments. The description is intended to be illustrative of the invention and should not be taken to be limiting. Many variations, modifications, additions and improvements may fall within the scope of the invention as defined in the claims that follow.

Referring to FIG. 1, processor 100 includes a general purpose processor 110 coupled to a vector processor 120. General purpose processor 110 and vector processor 120 are coupled via control bus 112 and interrupt line 114. General purpose processor 110 and vector processor 120 are coupled to cache system 130 via bus 116 and bus 118, respectively. Cache system 130 is coupled to input/output bus (IOBUS) 180 and fast bus (FBUS) 190. IOBUS 180 is coupled to system timer 182, universal asynchronous receiver-transmitter (UART) 184, bitstream processor 186 and interrupt controller 188. FBUS 190 is coupled to device interface 192, direct memory access (DMA) controller 194, local bus interface 196 and memory controller 198.

General purpose processor 110 and vector processor 120 execute separate program threads in parallel. General purpose processor 110 typically executes instructions which manipulate scalar data. Vector processor 120 typically executes instructions having vector operands, i.e., operands each containing multiple data elements of the same type. In some embodiments, general purpose processor 110 has a limited vector processing capability. However, applications that require multiple computations on large arrays of data are not suited for scalar processing or even limited vector processing. For example, multimedia applications such as audio and video data compression and decompression require many repetitive calculations on pixel arrays and strings of audio data. To perform real-time multimedia operations, a general purpose processor which manipulates scalar data (e.g. one pixel value or sound amplitude per operand) or only small vectors must operate at a high clock frequency. In contrast, a vector processor executes instructions where each operand is a vector containing multiple data elements (e.g. multiple pixel values or sound amplitudes). Therefore, vector processor 120 can perform real-time multimedia operations at a fraction of the clock frequency required for general purpose processor 110 to perform the same function. Thus, by allowing an efficient division of the tasks required for, e.g., multimedia applications, the combination of general purpose processor 110 and vector processor 120 provides high performance per cost. Although in the preferred embodiment, processor 100 is for multimedia applications, processor 100 may be any type of processor.

In one embodiment, general purpose processor 110 executes a real-time operating system designed for a media circuit board communicating with a host computer system. The real-time operating system communicates with a primary processor of the host computer system, services input/output (I/O) devices on or coupled to the media circuit board, and selects tasks which vector processor 120 executes. In that embodiment, vector processor 120 is designed to perform computationally intensive tasks requiring the manipulation of large data blocks, while general purpose processor 110 acts as the master processor to vector processor 120.

In the exemplary embodiment, general purpose processor 110 is a 32-bit RISC processor which operates at 40 MHz and conforms to the standard ARM7 instruction set. The architecture for an ARM7 reduced instruction set computer (RISC) processor and the ARM7 instruction set is described in the ARM7DM Data Sheet available from Advanced RISC Machines Ltd. General purpose processor 110 also implements an extension of the ARM7 instruction set which includes instructions for an interface with vector processor 120. The extension to the ARM7 instruction set for the exemplary embodiment of the invention is described in copending U.S. patent application Ser. No. 08/699,295, attorney docket No. M-4366 U.S., filed on Aug. 19, 1996, entitled "System and Method for Handling Software Interrupts with Argument Passing," naming Seungyoon Peter Song, Moataz A. Mohamed, Heon-Chul Park and Le Nguyen as inventors, which is incorporated herein by reference in its entirety. General purpose processor 110 is coupled to vector processor 120 by control bus 112 to carry out the extension of the ARM7 instruction set. Furthermore, interrupt line 114 is used by vector processor 120 to request an interrupt on general purpose processor 110.

In the exemplary embodiment, vector processor 120 has a single-instruction-multiple-data (SIMD) architecture and manipulates both scalar and vector quantities. In the exemplary embodiment, vector processor 120 consists of a pipelined reduced instruction set computer (RISC) central processing unit (CPU) that operates at 80 MHz and has a 288-bit vector register file. Each vector register in the vector register file can contain up to 32 data elements. A vector register can hold thirty-two 8-bit or 9-bit integer data elements, sixteen 16-bit integer data elements, or eight 32-bit integer or floating point elements. Additionally, the exemplary embodiment can also operate on a 576-bit vector operand spanning two vector registers.

The instruction set for vector processor 120 includes instructions for manipulating vectors and for manipulating scalars. The instruction set for the exemplary embodiment of the invention and an architecture for implementing the instruction set is described in the pending U.S. patent application Ser. No. 08/699,597, attorney docket No. M-4355 U.S., filed on Aug. 19, 1996, entitled "Single-Instruction-Multiple-Data Processing in a Multimedia Signal Processor," naming Le Trong Nguyen as inventor, which is incorporated herein by reference in its entirety.

General purpose processor 110 performs general tasks and executes a real-time operating system which controls communications with device drivers. Vector processor 120 performs vector tasks. General purpose processor 110 and vector processor 120 may be scalar or superscalar processors. The multiprocessor operation of the exemplary embodiment of the invention is more fully described in pending U.S. patent application Ser. No. 08/697,102, attorney docket No. M-4354 U.S., filed on Aug. 19, 1996, entitled "Multiprocessor Operation in a Multimedia Signal Processor," naming Le Trong Nguyen as inventor, which is incorporated herein by reference in its entirety.

Referring again to FIG. 1, cache system 130 contains a fast random access memory (RAM) block (shown graphically as blocks 140 and 170), read only memory (ROM) 150 and a cache control unit 160. Cache system 130 can configure the RAM block into (i) an instruction cache 142 and a data cache 144 for general purpose processor 110, and (ii) an instruction cache 172 and data cache 174 for vector processor 120. In the preferred embodiment, RAM block 140, 170 includes static RAM (SRAM).
In an embodiment of a computer system according to the invention, general purpose processor 110 and vector processor 120 share a variety of on-chip and off-chip resources which are accessible through a single address space. Cache system 130 couples a memory to any of several memory mapped devices such as bitstream processor 186, UART 184, DMA controller 194, local bus interface 196, and a coder-decoder (CODEC) device interfaced through device interface 192. Cache system 130 can use a transaction-oriented protocol to implement a switchboard for data access among the processors and memory mapped resources. For example, the transaction-oriented protocol provides that if completion of an initial cache transaction is delayed (e.g., due to a cache miss), other cache access transactions may proceed prior to completion of the initial transaction. Thus, a "step-aside-and-wait" capability is provided in this embodiment of a cache management system according to the invention. A similar transaction-oriented protocol is further described in pending U.S. patent application Ser. No. 08/731,393, attorney docket No. M-4398 U.S., filed on Oct. 18, 1996, entitled "Shared Bus System with Transaction and Destination ID," naming Amjad Z. Qureshi and Le Trong Nguyen as inventors, which is incorporated herein by reference in its entirety.
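The "step-aside-and-wait" behavior described above can be sketched as follows: a transaction that misses steps aside so that later transactions may complete first. This is an illustrative model only; all names are invented:

```python
class StepAsideCache:
    """Sketch of step-aside-and-wait: a delayed transaction (cache miss)
    does not block later transactions from completing first."""
    def __init__(self, contents):
        self.contents = dict(contents)   # address -> data currently cached
        self.waiting = []                # transactions stalled on a miss
        self.completed = []              # (address, data) in completion order

    def request(self, address):
        if address in self.contents:
            self.completed.append((address, self.contents[address]))
        else:
            self.waiting.append(address)  # step aside; a memory fetch is pending

    def memory_returns(self, address, data):
        """The slower memory finally answers; the stalled transaction completes."""
        self.waiting.remove(address)
        self.contents[address] = data
        self.completed.append((address, data))
```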
Cache system 130 couples general purpose processor 110 and vector processor 120 to two system busses: IOBUS 180 and FBUS 190. IOBUS 180 typically operates at a slower frequency than FBUS 190. Slower speed devices are coupled to IOBUS 180, while higher speed devices are coupled to FBUS 190. By separating the slower speed devices from the higher speed devices, the slower speed devices are prevented from unduly impacting the performance of the higher speed devices.

Cache system 130 also serves as a switchboard for communication between IOBUS 180, FBUS 190, general purpose processor 110, and vector processor 120. In most embodiments of cache system 130, multiple simultaneous accesses between the busses and processors are possible. For example, vector processor 120 is able to communicate with FBUS 190 at the same time that general purpose processor 110 is communicating with IOBUS 180. In one embodiment of the invention, the combination of the switchboard and caching function is accomplished by using direct mapping techniques for FBUS 190 and IOBUS 180. Specifically, the devices on FBUS 190 and IOBUS 180 can be accessed by general purpose processor 110 and vector processor 120 by standard memory reads and writes at appropriate addresses.

FBUS 190 provides an interface to the main memory. The interface unit to the memory is composed of a four-entry address queue and a one-entry write-back latch. The interface can support one pending refill (read) request from general purpose processor instruction cache 142, one pending refill (read) request from vector processor instruction cache 172, one write request from vector processor data cache 174, and one write-back request from vector processor data cache due to a dirty cache line.

FBUS 190 is coupled to various high speed devices such as a memory controller 198 and a DMA controller 194, a local bus interface 196, and a device interface 192. Memory controller 198 and DMA controller 194 provide memory interfaces. Local bus interface 196 provides an interface to a local bus coupled to a processor. Device interface 192 provides interfaces to various digital-to-analog and analog-to-digital converters (DACs and ADCs, respectively) that may be coupled to processor 100 for video, audio or communications applications.
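The direct-mapped device access described above amounts to dispatching each memory read or write to a bus and device by address range. The address ranges below are hypothetical, purely for illustration; the patent does not give concrete ranges:

```python
# Hypothetical address ranges for illustration only.
ADDRESS_MAP = [
    (0x0000_0000, 0x0FFF_FFFF, "main memory (FBUS, memory controller 198)"),
    (0x1000_0000, 0x1000_FFFF, "UART 184 (IOBUS)"),
    (0x1001_0000, 0x1001_FFFF, "bitstream processor 186 (IOBUS)"),
    (0x2000_0000, 0x2FFF_FFFF, "local bus interface 196 (FBUS)"),
]

def route(address):
    """Return which bus/device a standard memory read or write reaches."""
    for low, high, device in ADDRESS_MAP:
        if low <= address <= high:
            return device
    raise ValueError(f"unmapped address {address:#x}")
```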
Memory controller 198 provides an interface for a local memory if a local memory is provided for processor 100. Memory controller 198 controls reads and writes to the local memory. In the exemplary embodiment, memory controller 198 is coupled to and controls one bank of synchronous dynamic RAMs (two 1M×16 SDRAM chips) configured to use 24 to 26 address bits and 32 data bits and having the features of: (i) a "CAS-before-RAS" refresh protocol, performed at a programmable refresh rate, (ii) partial writes that initiate Read-Modify-Write operations, and (iii) internal bank interleave. Memory controller 198 also provides a 1:1 frequency match between the local memory and FBUS 190, manual "both bank precharge", and address and data queuing to better utilize FBUS 190. Synchronous DRAMs are known to effectively operate at such frequencies (80 MHz), and standard fast page DRAMs and extended data out (EDO) DRAMs could also be used. DRAM controllers with capabilities similar to memory controller 198 in the exemplary embodiment are known in the art.

DMA controller 194 controls direct memory accesses between the main memory of a host computer and the local memory of processor 100. Such DMA controllers are well known in the art. In some embodiments of the invention, a memory data mover is included. The memory data mover performs DMA from one block of memory to another block of memory.

Local bus interface 196 implements the required protocol for communications with a host computer via a local bus. In the exemplary embodiment, local bus interface 196 provides an interface to a 33-MHz, 32-bit PCI bus. Such interfaces are well known in the art.

Device interface 192 provides a hardware interface for devices such as audio, video and communications DACs and ADCs which would typically be on a printed circuit board with a processor 100 adapted for multimedia applications. Device interface 192 may be customized for the particular application of processor 100. In particular, device interface 192 might only provide an interface for specific devices or integrated circuits (ICs). Typical units within device interface 192 provide an interface for connection of standard ADCs, DACs, or CODECs. Designs for ADC, DAC, and CODEC interfaces are well known in the art and not described further here. Other interfaces which might be employed include but are not limited to an integrated services digital network (ISDN) interface for digital telephone and interfaces for busses such as for a microchannel bus. In one embodiment of processor 100, device interface 192 is an application specific integrated circuit (ASIC) which can be programmed to perform a desired functionality.

In the preferred embodiment, IOBUS 180 operates at a frequency (40 MHz) that is lower than the operating frequency (80 MHz) of FBUS 190. Also in the preferred embodiment, IOBUS 180 is coupled to system timer 182, UART 184, bitstream processor 186, and interrupt controller 188.

System timer 182 interrupts general purpose processor 110 at scheduled intervals which are selected by writing to registers corresponding to system timer 182. In the exemplary embodiment, system timer 182 is a standard Intel 8254 compatible interval timer having three independent 16-bit counters and six programmable counter modes.
`
`
`
`
UART 184 is a serial interface which is compatible with the common 16450 UART integrated circuit. The 16450 UART IC is for use in modem or facsimile applications which require a standard serial communication ("COM") port of a personal computer.

Bitstream processor 186 is a fixed hardware processor which performs specific functions on an input or output bitstream. In the exemplary embodiment, bitstream processor 186 performs initial or final stages of MPEG coding or decoding. In particular, bitstream processor 186 performs variable length (Huffman) coding and decoding, and packing and unpacking of video data in "zig-zag" format. Bitstream processor 186 operates in parallel with and under the control of general purpose processor 110 and vector processor 120. Processors 110 and 120 configure bitstream processor 186 via control registers. An exemplary embodiment of bitstream processor 186 is described in pending U.S. patent application Ser. No. 08/699,303, attorney docket No. M-4368 U.S., filed on Aug. 19, 1996, entitled "Methods and Apparatus for Processing Video Data," naming Cliff Reader, Jae Cheol Son, Amjad Qureshi and Le Nguyen as inventors, which is incorporated herein by reference in its entirety.

Interrupt controller 188 controls interrupts of general purpose processor 110 and supports multiple interrupt priorities. A mask register is provided to allow each interrupt priority to be individually masked. In the exemplary embodiment, interrupt controller 188 is programmable and implements the standard Intel 8259 interrupt system that is common in x86-based personal computers. A highest priority (level 0) interrupt is assigned to system timer 242. Priority levels 1, 2, 3, and 7 are respectively assigned to a virtual frame buffer, DMA controller 194 and device interface 192, bitstream processor 186, local bus interface 196, and UART 184. Interrupt priority levels 4, 5, and 6 are unassigned in the exemplary embodiment of the invention. The virtual frame buffer at priority level 1, which is included in some embodiments of the invention, emulates a standard VGA frame buffer.
Referring to FIG. 2, cache system 130 includes SRAM block 210, ROM 150, data pipeline 220, address pipeline 230 and cache control unit 160. SRAM block 210, ROM 150 and cache control unit 160 are each separately coupled to data pipeline 220 and to address pipeline 230. Data pipeline 220 is coupled to IOBUS 180, FBUS 190, general purpose processor 110 and vector processor 120. Address pipeline 230 is coupled to general purpose processor 110 and vector processor 120.

SRAM block 210 is divided into four memory banks to form instruction cache 142 and data cache 144 for use with general purpose processor 110, as well as instruction cache 172 and data cache 174 for use with vector processor 120. In any cycle, cache system 130 can accept one read request and one write request. SRAM block 210 is a dual-ported memory circuit, with read port 216 and write port 214, so that simultaneous reading and writing of SRAM block 210 is supported. SRAM block 210 also contains a tag section 212 which is subdivided into TAG 142B, TAG 144B, TAG 172B and TAG 174B for each of the respective memory banks 142A, 144A, 172A and 174A. The tag RAM has two read ports. The read port address and the write port address can be compared with the internal cache tags for hit or miss condition. The tag information for each cache line includes a tag, two validity bits, two dirty bits, and use information. Each validity bit and dirty bit corresponds to a 32-byte half of a cache line, which is equal to the amount of data transferred by a single read or write operation. Each dirty bit indicates a single 256-bit write to external memory, and each validity bit indicates a single 256-bit read from external memory. The used bits are for the entry replacement scheme used to create new entries. Four sets of cache bank select signals and three sets of line indices are needed to access SRAM block 210.
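The per-line tag layout just described (a tag, two validity bits, two dirty bits, and use information, with each bit covering one 32-byte half-line) can be sketched as follows; the field and method names are invented for illustration:

```python
from dataclasses import dataclass, field

HALF_LINE_BYTES = 32  # each validity/dirty bit covers a 32-byte (256-bit) half-line

@dataclass
class TagEntry:
    """Tag information for one 64-byte cache line, per the description above."""
    tag: int = 0
    valid: list = field(default_factory=lambda: [False, False])  # one bit per half-line
    dirty: list = field(default_factory=lambda: [False, False])  # one bit per half-line
    used: bool = False  # use information for the entry replacement scheme

    def write_half(self, half_index):
        """A write fills and dirties one 32-byte half of the line."""
        self.valid[half_index] = True
        self.dirty[half_index] = True
        self.used = True

    def needs_writeback(self):
        """Any dirty half must eventually be written back to external memory."""
        return any(self.dirty)
```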
ROM 150 includes ROM cache field 150A and ROM tag field 150B. ROM 150 can be configured as a cache. Although tag field 150B cannot be modified, individual addresses can be marked as invalid so that data or instructions can be brought from memory to be used in place of the data or instructions in ROM 150. ROM 150 contains frequently used instructions and data for general purpose processor 110 and vector processor 120. In the exemplary embodiment, ROM 150 contains: reset and initialization procedures; self-test diagnostics procedures; interrupt and exception handlers; subroutines for soundblaster emulation; subroutines for V.34 modem signal processing; general telephony functions; 2-dimensional and 3-dimensional graphics subroutine libraries; and subroutine libraries for audio and video standards such as MPEG-1, MPEG-2, H.261, H.263, G.728, and G.723.
Data pipeline 220 performs the data switchboard function of cache system 130. Data pipeline 220 is able to create multiple simultaneous data communication paths between IOBUS 180, FBUS 190, general purpose processor 110, vector processor 120 and SRAM block 210 and ROM 150. Similarly, address pipeline 230 performs switchboard functions for addresses. In the embodiment of FIG. 2, IOBUS 180 and FBUS 190 use time multiplexing for address and data signals. Cache control unit 160 provides the control lines to data pipeline 220 and address pipeline 230 to properly configure the communication channels.
In some embodiments of cache system 130, a transaction-based protocol is used to support all read and write operations. Any unit coupled to cache system 130, such as general purpose processor 110, vector processor 120, or the various devices coupled to IOBUS 180 and FBUS 190, can place a request to cache system 130. Such a request is formed by a device identification code ("device ID") and an address of the requested memory location. Each unit has a distinct device ID and cache system 130 can prioritize the requests based on the device ID of the unit making the request. When the data at the requested address becomes available, cache system 130 responds with the device ID, a transaction identification code ("transaction ID"), the address, and the requested data.
If the requested address is not contained in SRAM block 210 or ROM 150, cache system 130 will not be able to respond to the specific request for several clock cycles while the data at the memory address is retrieved. However, while the data of a first request is being