US005721862A

United States Patent [19]
Sartore et al.

[11] Patent Number: 5,721,862
[45] Date of Patent: *Feb. 24, 1998
[54] ENHANCED DRAM WITH SINGLE ROW SRAM CACHE FOR ALL DEVICE READ OPERATIONS

[75] Inventors: Ronald H. Sartore, San Diego, Calif.; Kenneth J. Mobley, Colorado Springs, Colo.; Donald G. Carrigan, Monument, Colo.; Oscar Frederick Jones, Jr., Colorado Springs, Colo.
[73] Assignee: Ramtron International Corporation, Colorado Springs, Colo.

[*] Notice: The term of this patent shall not extend beyond the expiration date of Pat. No. 5,699,317.
[21] Appl. No.: 460,665

[22] Filed: Jun. 2, 1995
Related U.S. Application Data

[63] Continuation of Ser. No. 319,289, Oct. 6, 1994, which is a continuation-in-part of Ser. No. 824,211, Jan. 22, 1992, abandoned.
[51] Int. Cl.6 ........................................ G06F 12/00
[52] U.S. Cl. .............. 395/445; 365/189.04; 365/189.05
[58] Field of Search .................. 395/445, 433; 365/189.02, 189.03, 189.04, 189.05, 230.08, 230.06, 230.02, 49

[56]
References Cited

U.S. PATENT DOCUMENTS

4,577,293   3/1986  Matick et al. .................. 365/189
4,608,668   8/1986  Uchida .
4,725,945   2/1988  Kronstadt ...................... 364/200
4,794,559  12/1988  Greenberger ..................... 365/49
4,870,622   9/1989  Aria et al. ................ 365/230.02
4,894,770   1/1990  Ward et al. .................... 364/200
4,926,385   5/1990  Fujishima et al. ........... 365/230.03
4,943,944   7/1990  Sakui et al. ............... 365/189.05
5,025,421   6/1991  Cho ........................ 365/230.05
5,111,386   5/1992  Fujishima et al. ............... 395/425

(List continued on next page.)
FOREIGN PATENT DOCUMENTS

41 18 804 A1  12/1991  Germany .
   60-258791  12/1985  Japan .
    63-81692   4/1988  Japan .
    1-159891   6/1989  Japan .
OTHER PUBLICATIONS

Niijima, et al., "QRAM-Quick Access Memory System", IEEE International Conference on Computer Design: V.L.S.I. in Computers and Processors, pp. 417-420 (Sep. 17, 1990).

(List continued on next page.)
Primary Examiner—Tod R. Swann
Assistant Examiner—Keith W. Saunders
Attorney, Agent, or Firm—William J. Kubida, Esq.; Richard A. Bachand, Esq.; Peter J. Meza, Esq.

[57] ABSTRACT

An enhanced DRAM contains embedded row registers in the form of latches. The row registers are adjacent to the DRAM array, and when the DRAM comprises a group of subarrays, the row registers are located between DRAM subarrays. When used as on-chip cache, these registers hold frequently accessed data. This data corresponds to data stored in the DRAM at a particular address. When an address is supplied to the DRAM, it is compared to the address of the data stored in the cache. If the addresses are the same, then the cache data is read at SRAM speeds. The DRAM is decoupled from this read. The DRAM also remains idle during this cache read unless the system opts to precharge or refresh the DRAM. Refresh or precharge occur concurrently with the cache read. If the addresses are not the same, then the DRAM is accessed and the embedded register is reloaded with the data at that new DRAM address. Asynchronous operation of the DRAM is achieved by decoupling the row registers from the DRAM array, thus allowing the DRAM cells to be precharged or refreshed during a read of the row register. Additionally, the row registers/memory cache is sized to contain a row of data of the DRAM array. Furthermore, a single column decoder addresses corresponding locations in both the memory cache and the DRAM array. And finally, all reads are only from the memory cache.
36 Claims, 5 Drawing Sheets
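The read behavior described in the abstract can be summarized as a small behavioral model. The sketch below is an illustrative toy model, not circuitry from the patent, and all class and method names are my own. It shows the two cases the abstract describes: a hit is served from the single row register at SRAM speed even while the array refreshes, while a miss must wait for the array and then reloads the register.

```python
# Toy behavioral model (illustrative only) of the read path in the
# abstract: every read comes from the row-register cache; on a hit the
# DRAM array stays decoupled, so refresh/precharge can run concurrently.

class EDRAMReadPath:
    def __init__(self, dram_rows):
        self.dram = dram_rows        # DRAM array: list of rows
        self.cache = None            # SRAM-speed row register contents
        self.cached_row = None       # address tag of the cached row
        self.refreshing = False      # DRAM busy with refresh/precharge

    def begin_refresh(self):
        self.refreshing = True

    def end_refresh(self):
        self.refreshing = False

    def read(self, row, col):
        if row == self.cached_row:
            # read hit: SRAM-speed access, legal even during refresh
            return self.cache[col]
        if self.refreshing:
            raise RuntimeError("read miss must wait: DRAM array is busy")
        self.cache = list(self.dram[row])   # reload register on a miss
        self.cached_row = row
        return self.cache[col]
```

A hit read and a concurrent refresh can overlap; only a miss is serialized behind the refresh.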
[Front-page figure: block diagram of the EDRAM showing a column decoder, row address register and latch, row registers with load and multiplexer circuitry, data buffers with output enable, a refresh address counter with refresh control, and row address control logic receiving the column address (A0-A10).]
Vervain Ex. 2017, p.1
Micron v. Vervain
IPR2021-01550
U.S. PATENT DOCUMENTS

5,148,346   9/1992  Nakada ..................... 365/189.03
5,179,687   1/1993  Hidaka et al. ................. 395/425
5,184,320   2/1993  Dye ............................ 365/49
5,184,325   2/1993  Lipovski ................... 365/189.07
5,214,610   5/1993  Houston .................... 365/189.05
5,226,009   7/1993  Arimoto .................... 365/189.04
5,226,139   7/1993  Fujishima et al. .............. 395/425
5,226,147   7/1993  Fujishima et al. .............. 395/425
5,249,282   9/1993  Segers ........................ 395/449
5,305,280   4/1994  Hayano ..................... 365/230.03
5,329,489   7/1994  Diefendorff ................ 365/189.05
5,381,370   1/1995  Lacey et al. .................. 365/200
5,390,308   2/1995  Ware et al. ................... 395/405
5,421,000   5/1995  Fortino et al. ................ 395/445
5,471,601  11/1995  Gonzales ...................... 395/403

OTHER PUBLICATIONS

Goodman and Chiang, "The Use of Static Column RAM as a Memory Hierarchy," The 11th Annual Symposium on Computer Architecture, IEEE Computer Society Press, 1984, pp. 167-174.

Dosaka, et al., "A 100MHz 4Mb Cache DRAM with Fast Copy-back Scheme," Digest of Technical Papers, 1992 IEEE International Solid-State Circuits Conference, pp. 148-149 (Jun. 1992).

Ohta, et al., "A 1Mb DRAM with 33MHz Serial I/O Ports," Digest of Technical Papers, 1986 IEEE International Solid-State Circuits Conference, pp. 274-275 (1986).

Hitachi, "Video RAM," Specification for parts HM53461 and HM53462, pp. 30-33.

Bursky, "Combination DRAM-SRAM Removes Secondary Caches", Electronic Design, vol. 40, No. 2, pp. 39-43 (Jan. 23, 1992).

Sartore, R.H., "New Generation of Fast, Enhanced DRAMs Replace Static RAM Caches in High-End PC Workstations." (Jul. 9, 1991).
U.S. Patent          Feb. 24, 1998          Sheet 1 of 5          5,721,862

FIG. 1 (PRIOR ART)

[FIG. 1: board-level prior art arrangement: processor 10 with cache controller 12 and static RAM cache memory 14; a multiplexer 16 loads the cache from slow DRAMs 20, 22, 24 and 26; an enhanced DRAM is also shown for comparison.]
U.S. Patent          Feb. 24, 1998          Sheet 2 of 5          5,721,862

[FIG. 3: block diagram of the EDRAM data path: DRAM sub-array with sense amps and row decoder, column decoder, address latch, row registers, data buffers to the output with output enable and load controls, refresh address counter, refresh control, and row address control logic receiving the column address.]
U.S. Patent          Feb. 24, 1998          Sheet 3 of 5          5,721,862

[FIG. 4: row address control logic: row address signal register and latch, refresh counter and refresh control, comparison control circuit producing the hit/miss indication, and buffered and unbuffered addresses supplied to the row decoder.]
U.S. Patent          Feb. 24, 1998          Sheet 4 of 5          5,721,862

[FIG. 5: column address control logic 130, with column address load control 127, read/write controller 126/128, and a column kill detector. Output signal labels OUT1 through OUT4 are also shown.]
U.S. Patent          Feb. 24, 1998          Sheet 5 of 5          5,721,862

[FIG. 7: floor plan of the EDRAM: DRAM sub-arrays with sense amps and row decoders 56, column decoder, register control, and row address control, with row registers arranged between sub-arrays.]
ENHANCED DRAM WITH SINGLE ROW SRAM CACHE FOR ALL DEVICE READ OPERATIONS

This is a continuation of application Ser. No. 08/319,289, still pending, filed on Oct. 6, 1994, which is a continuation-in-part of application Ser. No. 07/824,211, filed on Jan. 22, 1992, and now abandoned.
FIELD OF THE INVENTION

The present invention relates to a dynamic random access memory ("DRAM") and more particularly to an Enhanced DRAM (which we call an "EDRAM") with embedded registers to allow fast random access to the DRAM while decoupling the DRAM from data processing operations. The parent application, Ser. No. 07/824,211, filed Jan. 22, 1992, is incorporated herein by reference.
BACKGROUND OF THE INVENTION

As the computer industry evolves, demands for memory have out-paced the technology of available memory devices. One of these demands is high speed memory compatibility. Thus, in a computer system, such as a personal computer or other computing system, memory subsystems have become an influential component toward the overall performance of the system. Emphasis is now on refining and improving memory devices that provide affordable, zero-wait-state operations.
Generally, volatile memories are either DRAM or static RAM ("SRAM"). Each SRAM cell includes plural transistors. Typically the data stored in a SRAM cell is stored by the state of a flip-flop formed by some of the transistors. As long as power is supplied, the flip-flop keeps its data: it does not need refreshing. In a DRAM cell, on the other hand, there typically is one transistor, and data is stored in the form of charge on a capacitor that the transistor accesses. The capacitor dissipates its charge and needs to be refreshed.

These two types of volatile memories have respective advantages and disadvantages. With respect to memory speed, the SRAM is faster than the DRAM due, partially at least, to the nature of the cells. The disadvantage, however, is that because there are more transistors, the SRAM memory is less dense than a DRAM of the same physical size. For instance, static RAMs traditionally have a maximum of one-fourth the number of cells of a DRAM which uses the same technology.
While the DRAM has the advantage of smaller cells and thus higher cell density (and lower cost per bit), one disadvantage is that the DRAM must refresh its memory cells whereas the SRAM does not. While the DRAM refreshes and precharges, access to the memory cells is prohibited. This creates an increase in access time, a drawback from which the static RAM does not suffer.

However, the speed and functionality of current DRAMs are often emphasized less than memory size (storage capacity) and cost. This is evidenced by the fact that DRAM storage capacity density has increased at a rate an order of magnitude greater than its speed. While there has been some improvement in access time, systems using DRAMs generally have had to achieve their speed elsewhere.
In order to increase system speed, cache memory techniques have recently been applied to DRAM main memory. These approaches have generally been implemented on a circuit board level. That is, a cache memory is frequently a high-speed buffer interposed on the circuit board between the processor chip and the main memory chip. While some efforts have been made by others to integrate a cache with DRAM, we first address the board level approach.
FIG. 1 indicates a prior art configuration (board-level) wherein a processor chip 10 is configured with a cache controller 12 and a cache memory 14. The main purpose of the cache memory is to maintain frequently accessed data for high speed system access. Cache memory 14 (sometimes called "secondary cache static RAM") is loaded via a multiplexer 16 from DRAMs 20, 22, 24 and 26. Subsequently, data is accessed at high speeds if stored in cache memory 14. If not, DRAMs 20, 22, 24 and/or 26 load the sought data into cache memory 14. As seen in FIG. 1, cache memory 14 may comprise a SRAM, which is generally faster than DRAMs 20-26.
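The board-level flow just described (check the external SRAM cache first; on a miss, route the sought line from one of the slow DRAMs through the multiplexer into the cache) can be sketched as follows. The function, its parameters, and the bank-interleaving scheme are illustrative assumptions of mine, not details from the patent.

```python
# Illustrative sketch of the FIG. 1 board-level arrangement: a cache
# lookup precedes every access, and a miss pays for a line fill from
# a DRAM bank through the multiplexer before data can be returned.

def system_read(address, sram_cache, dram_banks, line_size=4):
    """Return data at `address`: CPU -> cache controller -> DRAM banks."""
    line_addr = address - address % line_size
    if line_addr not in sram_cache:                          # cache miss
        bank = dram_banks[(line_addr // line_size) % len(dram_banks)]
        # the multiplexer routes the selected DRAM's line into the cache;
        # the narrow fill path is why this costs many cycles in practice
        sram_cache[line_addr] = [bank[line_addr + i] for i in range(line_size)]
    return sram_cache[line_addr][address - line_addr]
```

Note that, as the specification observes, the DRAMs are reachable only through the cache: even a miss ultimately returns its data out of `sram_cache`.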
Various approaches have been proposed for cache memory implementation. These approaches include controlling external cache memory by a controller, such as cache memory 14 and cache controller 12 in FIG. 1, or discrete proprietary logic. Notwithstanding their benefits, cache memory techniques complicate another major problem that exists in system design. Memory components and microprocessors are typically manufactured by different companies. This requires the system designer to effectively bridge these elements, using such devices as the cache controller 12 and the multiplexer 16 of FIG. 1. These bridge components are usually produced by other companies. The different pin configurations and timing requirements of these components make interfacing them with other devices difficult. Adding a cache memory that is manufactured by yet another company creates further design problems, especially since there is no standard for cache implementation.
Exacerbating the system design problems is the disadvantage that the use of external cache memory (such as cache memory 14) compromises the main storage access speed. There are mainly two reasons for this compromise. First, and most significant, the main storage access is withheld until a "cache miss" is realized. The penalty associated with this miss can represent up to two wait states for a 50 MHz system. This is in addition to the time required for a main memory access. Second, the prioritized treatment of physical routing and buffers afforded the external cache is usually at the expense of the main memory data and address access path. As illustrated in FIG. 1, data from DRAMs 20, 22, 24 and 26 can be accessed only through cache memory 14. The actual delay may be small, but adds up quickly.
A third problem associated with separate cache and main memory is that the time for loading the cache memory from the main memory ("cache fill") is dependent on the number of inputs to the cache memory from the main memory. Since the number of inputs to the cache memory from the main memory is usually substantially less than the number of bits that the cache memory contains, the cache fill requires many clock cycles. This compromises the speed of the system.
A memory architecture that has been used or suggested for video RAMs ("VRAMs") is to integrate serial registers with a main memory. VRAMs are specific to video graphics applications. A VRAM may comprise a DRAM with high speed serial registers allowing an additional access port for a line of digital video data. The extra memory used here is known as a SAM (serially addressed memory), which is loaded using transfer cycles. The SAM's data is output by using a serial clock. Hence, access to the registers is serial, not random. Also, there is continuous access to the DRAM so refresh is not an issue as it is in other DRAM applications.

Another implementation of on-chip cache memory, expected to come to market in 1992, will use a separate
cache and cache controller sub-system on the chip. It uses full cache controllers and cache memory implemented in the same way as it would be if external to the chip, i.e. a system approach. This approach is rather complicated and requires a substantial increase in die size. Further, the loading time of the cache memory from the main memory is constrained by the use of input/output cache access ports that are substantially fewer in number than the number of cache memory cells. A cache fill in such a manner takes many clock cycles, whereby system access speed suffers. Such an approach is, in the inventors' views, somewhat cumbersome and less efficient than the present invention.

Still another problem in system design arises when the system has both (a) interleaved memory devices together with (b) external cache memory. Interleaving assigns successive memory locations to physically different memory devices, thereby increasing data access speed. Such interleaving is done for high-speed system access such as burst modes. The added circuitry for cache control and main memory multiplexing usually required by external cache memory creates design problems for effective interleaved memory devices.

Another problem with the prior art arises when memory capacity is to increase. Adding more memory would involve adding more external SRAM cache memory and more cache control logic. For example, doubling the memory size in FIG. 1 requires not only more DRAM devices, but also another multiplexer and possibly another cache controller. This would obviously add to system power consumption, detract from system reliability, decrease system density, add manufacturing costs and complicate system design.

Another problem concerns the cost of manufacturing a system with an acceptable cache hit probability. When using external cache memory, manufacturers allocate a certain amount of board area for the main memory. A smaller area is allocated for the external cache. Usually, it is difficult to increase the main memory and the external cache memory while maintaining an acceptable cache hit probability. This limitation arises from the dedication of more board area for the main memory than for external cache.

A further problem with system speed is the need for circuitry external to the main memory to write "post" data. Post data refers to data latched in a device until it is needed. This is done because the timing requirement of the component needing the data does not synchronize with the component or system latching the data. This circuitry usually causes timing delays for the component or system latching the data.

As stated supra, access to the DRAM memory cells during a precharge and refresh cycle was prohibited in the prior art. Some prior art approaches have tried to hide the refresh in order to allow access to DRAM data. One DRAM arrangement maintained the data output during a refresh cycle. The drawback of this arrangement was that only the last read data was available during the refresh. No new data read cycle could be executed during the refresh cycle.

A pseudo-static RAM is another arrangement that attempted to hide the refresh cycle. The device was capable of executing internal refresh cycles. However, any attempted data access during the refresh cycle would extend the data access time, in a worst case scenario, by a cycle time (refresh cycle time plus read access time). This arrangement did not allow true simultaneous access and refresh, but used a time division multiplexing scheme to hide the refresh cycle.

Another way to hide the refresh cycle is to interleave the RAM memory on the chip. When a RAM memory block with even addresses is accessed, the odd memory block is refreshed and vice-versa. This type of implementation requires more timing control restraints which translate to a penalty in access time.

Another type of problem arises when considering the type of access modes to the main memory. One type of access is called page mode, in which several column addresses are synchronously applied to an array after a row address has been received by the memory. The output data access time will be measured from the timing clock edge (where the column address is valid) to the appearance of the data at the output.

Another type of access mode is called static column mode wherein the column addresses are input asynchronously. Access can occur in these modes only when RAS is active (low), and a prolonged time may be required in the prior art.

When manufacturing chips that support these access types, only one of these access types can be implemented into the device. Usually, one of the last steps in the making of the memory chip will determine if it will support either type of access. Thus, memory chips made this way do not offer both access modes. This induces an added expense in that the manufacturer must use two different processes to manufacture the two types of chips.

To overcome these problems, small modifications added to a component, such as a DRAM, may yield an increase in system performance and eliminate the need for any bridging components. To successfully integrate the modification with the component, however, its benefit must be relatively great or require a small amount of die space. For example, DRAM yields must be kept above 50% to be considered producible. Yields can be directly correlated to die size. Therefore, any modifications to a DRAM must take into account any die size changes.

In overcoming these problems, new DRAM designs have become significant. The greatest disadvantage to caching within DRAMs has been that DRAMs are too slow. The present invention in one of its aspects seeks to change the architecture of the DRAM to take full advantage of the caching speed that may now be obtainable.

One way to meet this challenge is to integrate the functions of the main storage and cache. Embedding the cache memory within localized groups of DRAM cells would take advantage of the chip's layout. This placement reduces the amount of wire (conductive leads) used in the chip which in turn shortens data access times and reduces die size.

U.S. Pat. No. 5,025,421 to Cho is entitled "Single Port Dual RAM." It discloses a cache with typical DRAM bit lines connected to typical SRAM bit lines through pass gates. Reading and writing the SRAM and DRAM arrays occurs via a single port, which requires that input/output busses communicate with the DRAM bit lines by transmitting data through the SRAM bit lines. Using SRAM bit lines to access the DRAM array precludes any access other than refresh to the DRAM array while the SRAM array is being accessed, and conversely precludes access to the SRAM array while the DRAM array is being accessed, unless the data in the SRAM is the same data as in the currently accessed DRAM row. This is a functional constraint that is disadvantageous.

Moreover, the SRAM cells of Cho FIG. 1 are full SRAM cells, although his FIG. 4 may disclose using only a single latch (FF11) rather than an entire SRAM cell. However, the use of a single port with a simple latch raises a severe problem. Such an architecture lacks the ability to write data into the DRAM without corrupting the data in the SRAM
latch. Hence, the FIG. 4 configuration is clearly inferior to Cho's FIG. 1 configuration.

Another effort is revealed by U.S. Pat. No. 4,926,385 to Fujishima, Hidaka, et al., assigned to Mitsubishi, entitled, "Semiconductor Memory Device With Cache Memory Addressable By Block Within Each Column." There are other patents along these lines by Fujishima and/or Hidaka. This one uses a row register like Cho FIG. 4. Two ports are used, but two decoders are called for. While this overcomes several of the problems of Cho, it requires a good deal more space consumed by the second column decoder and a second set of input/output switch circuitry. (Subsequent Fujishima/Hidaka patents have eliminated the second access port and second decoder and have reverted to the Cho FIG. 1 approach, despite its disadvantages.) Nevertheless, in this patent, the "tag" and data coherency control circuitry for the cache is external to the chip and is to be implemented by the customer as part of the system design. The "tag" refers to information about what is in the cache at any given moment. A "hit" or "miss" indication is required to be generated in the system, external to the integrated circuit memory, and supplied to the chip. This leads to a complicated and slower system.

Other Fujishima, Hidaka, et al. U.S. patents include U.S. Pat. Nos. 5,111,386; 5,179,687; and 5,226,139.

Arimoto U.S. Pat. No. 5,226,009 is entitled, "Semiconductor memory device supporting cache and method of driving the same." This detects whether a hit or miss occurs by using a CAM cell array. The basic arrangement is like the approach of Cho FIG. 1 but modified to collect DRAM data from an "interface driver," which is a secondary DRAM sense amplifier, rather than from the primary DRAM sense amplifiers. This architecture still accesses the DRAM bit lines via the SRAM bit lines and is plagued with the single port problem. Circuitry is provided to preserve coherency between the DRAM and the SRAM. A set of tag registers is discussed with respect to a system-level (off-chip) implementation in a prior art drawing. Arimoto implements his on-chip cache tag circuitry using a content addressable memory array. That approach allows N-way mapping, which means that a group of memory devices in the cache can be assigned to any row in any of N subarrays. For example, if an architecture is "4-way associative," this means that there are four SRAM blocks, any of which can be written to by a DRAM. This method results in a large, expensive, and slow implementation of mapping circuitry. Using a CAM array for tag control has an advantage of allowing N-way association. However, the advantage of N-way association seems not to outweigh the disadvantage of the large and slow CAM array to support the N-way SRAM array.

Dye U.S. Pat. No. 5,184,320 is for a "Cached random access memory device and system" and includes on-chip cache control. The details of the actual circuitry are not disclosed, however. This patent also is directed to N-way association and considerable complication is added to support this.

Another piece of background art is Matick et al. U.S. Pat. No. 4,577,293 for a "Distributed on-chip cache." It has a 2-way associative cache implemented using a distributed (on-pitch) set of master-slave row register pairs. Full flexibility of access is provided by dual ports that are not only to the array but also to the chip itself. The two ports are totally independent, each having pins for full address input as well as data input/output. The cache control is on-chip.

Thus it should be appreciated that the art has heretofore often directed efforts in achieving N-way association. While this has led to complications, the art has thought that N-way association is the approach to follow.

The present invention, according to one of its aspects, rejects this current thinking and instead provides a streamlined architecture that not only includes on-chip cache control, but also operates so fast that the loss of N-way association is not a concern.

Therefore, it is a general object of this invention to overcome the above-listed problems.

Another object of the present invention is to isolate the cache memory data access operation from undesirable DRAM timing overhead operations, such as refresh and precharge.

A further object of the present invention is to eliminate the need for an external static RAM cache memory in high speed systems.

Still another object of the present invention is to insure cache/main memory data coherency.

Another object of this invention is to insure such data coherency in a fashion which minimizes overhead, so as to reduce any negative impact such circuitry might have on the random data access rate.

SUMMARY OF THE PRESENT INVENTION

The present invention provides a high-speed memory device that is hybrid in its construction and is well-suited for use in high-speed processor-based systems. A preferred embodiment of the present invention embeds a set of tightly coupled row registers, usable for a static RAM function, in a high density DRAM, preferably on the very same chip as the DRAM array (or subarrays). Preferably, the row registers are located within or alongside the DRAM array, and if the DRAM is configured with subarrays, then multiple sets of row registers are provided for the multiple subarrays, preferably one set of row registers for each subarray. Preferably the row registers are oriented parallel to DRAM rows (word lines), orthogonal to DRAM columns (bit lines). The row registers operate at high speed relative to the DRAM. Preferably the number of registers is smaller than the number of bit lines in the corresponding array or subarray. In the preferred embodiment, one row register corresponds to two DRAM bit line pairs, but in other applications, one register could be made to correspond to another number of DRAM bit line pairs. Preferably selection circuitry is included to select which of the several bit line pairs will be coupled to (or decoupled from) the corresponding row register.

Preferably the row registers are directly mapped, i.e. a one-way associative approach is preferred. Preferably the configuration permits extremely fast loading of the row registers by connecting DRAM bit lines to the registers via pass gates which selectively couple and decouple bit lines (bit line pairs) to the corresponding row registers. Thus, by selecting which bit line pairs are to be given access to the row registers, the sense amplifiers, for example, drive the bit lines to the voltages corresponding to the data states stored in a decoded row of DRAM cells and this is loaded quickly into the row registers. Thus, a feature of the present invention is a very quick cache fill.

The fast fill from the DRAM to the row registers provides a very substantial advantage. In the case of a read miss, mentioned below, a parallel load to the row registers is executed. Thereafter, each read from the same row is a read hit, which is executed at SRAM speeds rather than DRAM speeds.
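A minimal behavioral sketch of the direct-mapped ("one-way associative") arrangement preferred in the summary may help: each subarray keeps its own row-register set and a "last read row" tag, a read miss performs a parallel fill of that subarray's registers only, and a write touches the registers only when it lands on the cached row. All identifiers below are my own, and the model captures behavior only, not the pass-gate or sense-amplifier circuitry.

```python
# Illustrative sketch (not the patent's circuitry): direct-mapped row
# registers, one register set and one "last read row" tag per subarray.

class Subarray:
    def __init__(self, rows, cols):
        self.cells = [[0] * cols for _ in range(rows)]
        self.row_registers = [0] * cols  # this subarray's register set
        self.last_read_row = None        # "last read row" tag

    def read(self, row, col):
        if row != self.last_read_row:    # read miss: fast parallel fill
            # stands in for pass gates coupling the sensed bit lines
            # to the registers, loading a whole decoded row at once
            self.row_registers = list(self.cells[row])
            self.last_read_row = row
        return self.row_registers[col]   # all reads come from registers

class EDRAM:
    def __init__(self, n_subarrays, rows, cols):
        self.subarrays = [Subarray(rows, cols) for _ in range(n_subarrays)]

    def read(self, sub, row, col):
        return self.subarrays[sub].read(row, col)

    def write(self, sub, row, col, value):
        s = self.subarrays[sub]
        s.cells[row][col] = value
        # registers change only when the write hits the last read row,
        # so a write miss leaves the cached row available immediately
        if row == s.last_read_row:
            s.row_registers[col] = value
```

Because each subarray carries its own tag, a miss in one subarray never disturbs another subarray's cached row, which is the direct-mapped behavior the summary contrasts with N-way CAM-based schemes.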
`
Preferably the row registers are connected to a unidirectional output (read) port, and preferably this is a high impedance arrangement. That is, in the preferred embodiment, the registers are not connected to the source-drain path of the read port transistors, but instead they are connected to gate electrodes thereof. This leads to improvements in size and power.

The DRAM bit lines are preferably connected to a unidirectional input (write) port. In a circuit according to some aspects of the invention, the row registers can be decoupled from the DRAM bit lines and data could still be inputted to the DRAM bit lines via the write port. Moreover, even when the row registers are decoupled from the DRAM bit lines, data can be read from the row registers.

Preferably both the read and write ports operate off one decoder.

The configuration of an integrated circuit memory according to a related aspect of the invention will not require an input/output data buss connected to the sense amplifiers, since each DRAM subarray will be located between its corresponding set of row registers and the DRAM subarray's corresponding set of sense amplifiers, and since the data input and output functions are executed on the row register side.

In addition to including row registers, preferably in a directly mapped configuration, a circuit using the present

are not the same as the "last read row" for that particular DRAM block or subarray, the row register contents need not, and preferably will not, be overwritten. Moreover, changing rows during memory writes does not affect the contents of the row register until the row address specified for writing becomes the same as the "last read row." This allows the system (during write misses) to return immediately to the row register which had been accessed just prior to the write operation. Write posting can be executed without external data latches. Page mode memory writes can be accomplished within a single column address cycle time.

Without initiating a major read or write cycle, the row registers can be read under column address control. It is preferred that the chip is activated and the output is enabled.

The toggling of the on-chip address latch by the user allows the preferred embodiment of the present invention to operate in either a page or static column mode. Further, the zero nano-second hold allows the /RE signal to be used to multiplex the row and column addresses.

When a read hit occurs on an /RE initiated cycle, the internal row enable signal is not enabled and a DRAM access does not occur, thereby shortening the cycle time and the precharge required.

A novel and important aspect of the operation of such a DRAM with embedded row registers is the provisi