`
`IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. SC-21, NO. 5, OCTOBER 1986
`
`A 15-ns CMOS 64K RAM
`
STANLEY E. SCHUSTER, MEMBER, IEEE, BARBARA A. CHAPPELL, MEMBER, IEEE, ROBERT L. FRANCH, PAUL F. GREIER, STEPHEN P. KLEPNER, FANG-SHI J. LAI, MEMBER, IEEE, PETER W. COOK, MEMBER, IEEE, ROBERT A. LIPA, MEMBER, IEEE, REGINALD J. PERRY, WILLIAM F. POKORNY, AND MICHAEL A. ROBERGE
`
Abstract: This paper describes a 64K CMOS RAM with an access time of 15 ns. The RAM was built using a technology with self-aligned TiSi2, single-level metal, an average minimum feature size of 1.35 μm, and a minimum effective channel length of 1.1 μm. An access time of 10 ns is possible with the word line stitched on a second level of metal and some minor redesign. High speed is achieved through innovative circuits and design concepts. New CMOS circuits include a sense-amp set signal generator, a row decoder, and an input circuit. These circuits use CMOS devices to advantage for high-speed safe operation. A layout-rule-independent graphics tool was used for the artwork design.
`
I. INTRODUCTION
`
THE DRAMATIC reduction in memory access time that is taking place can be clearly seen in the plot of access time versus year for SRAM's presented at the ISSCC [1]–[16] shown in Fig. 1. FET memories at high levels of integration have moved into the very high-performance area. This downward trend in access time should continue into the foreseeable future. At the 1984 ISSCC, we presented a 20-ns 64K NMOS design [5]. Also included on the plot is a 0.78X scaling of that design presented at the 1985 International Symposium on VLSI Technology, Systems, and Applications which gave access times as fast as 11 ns [10].

In this paper we will describe a 64K CMOS RAM with measured access times of under 15 ns and simulated access times of 10 ns with the addition of a second level of metal and some minor redesign. The characteristics of the 64K CMOS RAM are given in Table I. The high speed of this CMOS RAM is due to a combination of technology and innovative CMOS peripheral circuitry. After a brief description of the technology, three of the key circuits will be described: the sense-amplifier set generator, the row decoder, and the input circuit. In each case, the advantageous use of CMOS devices for high speed while maintaining low-power safe
`
Manuscript received May 5, 1986; revised May 20, 1986.
S. E. Schuster, B. A. Chappell, R. L. Franch, P. F. Greier, S. P. Klepner, and P. W. Cook are with the Research Division, IBM Corporation, Yorktown Heights, NY 10598.
F.-S. Lai was with the Research Division, IBM Corporation, Yorktown Heights, NY 10598. He is now with the General Products Division, IBM Corporation, San Jose, CA 95193.
R. A. Lipa, W. F. Pokorny, and M. A. Roberge are with the General Technology Division, Essex Junction, VT 05452.
R. J. Perry was with the General Technology Division, Essex Junction, VT 05452. He is now with the Georgia Institute of Technology, Atlanta, GA 30332.
IEEE Log Number 8610069.
`
[Figure 1: access time (log scale, axis ticks from 6 to 100) versus year; legend: ISSCC papers, 64K NMOS, 64K CMOS, 256K CMOS.]

Fig. 1. Plot of access time versus year for SRAM's presented at the ISSCC.
`
TABLE I
64K CMOS RAM CHARACTERISTICS

Organization      64K (4K X 16)
Cell Type/Area    4-D NMOS / 210 μm²
Access Time       15 ns
Cycle Time        < 15 ns
Supply            5 V
`
operation will be featured. In addition, the use of this chip to demonstrate a layout-rule-independent physical design tool will be discussed. The use of a high-performance memory chip as a test vehicle served as a challenging demonstration of the potential of the tool.
`
II. TECHNOLOGY

The RAM was built using a relatively straightforward CMOS technology [17] with only a single level of metal. Process parameters are given in Table II. A cross section
`
0018-9200/86/1000-0704$01.00 © 1986 IEEE
`
`
`
`
of the CMOS structure is shown in Fig. 2. The main features of the technology include:

1) a 1-MeV ion-implanted retrograde n-well;
2) double diffused n+/n- arsenic-phosphorus junctions for the n-channel devices to improve the drain breakdown voltage and hot-electron reliability;
3) a self-aligned TiSi2 process with a nitride spacer to reduce the sheet resistances of both polysilicon gates and diffusions; and
4) a 4-μm-thick p-type epitaxial layer grown on a very heavily doped substrate to increase latch-up immunity.

TABLE II
CMOS TECHNOLOGY
[Table body not legible in the source.]

Fig. 2. Cross section of the CMOS structure (from [17]).

[Figure 3: timing waveforms with the cycle and access intervals marked; traces labeled INPUTS, OUTPUTS VALID, and the internal PRECHARGE signal.]

Fig. 3. Waveforms showing 64K CMOS RAM operation.

III. CHIP OPERATION

Chip operation, as shown in the waveforms of Fig. 3, differs from the more conventional approaches which use address-transition detection to initiate the timing chain. In this design a cycle is initiated only by CS falling. The inputs are sampled for a short period of time, then all inputs, including CS, are disconnected until output data have become valid and the precharge of the chip has begun. A longer than minimum cycle is shown. For a minimum cycle, inputs would have to be valid when the precharge of the chip has begun and a new cycle is initiated, as indicated on the figure. The approach offers a number of advantages, listed below, that typically are not available with more conventional approaches.

1) The chip can be operated at minimum cycle time since inputs may be changed during an access.
2) The chip is insensitive to glitches on the inputs once the short sampling period at the beginning of a cycle ends and the inputs are disconnected from the internal chip circuitry.
3) Data outputs are always latched in a valid state or are in a high-impedance state, except when they are in transition.
4) Precharging of internal nodes is automatically initiated at the end of an access.
5) The chip has the same cycle time for any combination of READ and WRITE operations even if data-in and data-out pins are shared.
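The sampling behavior described in this section can be illustrated with a small behavioral model. The following Python sketch is not taken from the design; the class name `ClockedInputs` and its methods are hypothetical, and the model abstracts all timing into discrete steps. It shows only the property that inputs are captured on the falling edge of CS and then ignored until the self-timed precharge begins.

```python
class ClockedInputs:
    """Toy model: inputs are sampled when CS falls, then ignored until precharge.

    Hypothetical illustration only; names and step-based timing are assumptions.
    """

    def __init__(self):
        self.connected = True   # inputs reach internal circuitry only while True
        self.latched = None     # address captured for the current access
        self.prev_cs = 1        # CS idles high between cycles

    def step(self, cs, address):
        """Advance one time step with the given CS level and external address."""
        if self.prev_cs == 1 and cs == 0 and self.connected:
            self.latched = address   # sample on the falling edge of CS
            self.connected = False   # disconnect all inputs, including CS
        self.prev_cs = cs
        return self.latched

    def precharge(self):
        """Self-timed end of access: reconnect the inputs for the next cycle."""
        self.connected = True


ram = ClockedInputs()
ram.step(1, 0x12)
print(hex(ram.step(0, 0x12)))  # address captured on the falling edge
print(hex(ram.step(0, 0x7F)))  # a later change on the inputs is ignored
```

In this model a glitch on the address or on CS during an access has no effect, which corresponds to advantages 1) and 2) in the list above.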
`
The cell array for this chip was taken from a previous 64K NMOS design (see [5] for a cell layout drawing). It is a four-device cell to which resistors could be added on a second level of poly. The addition of resistors would make the cell fully static and would require no change in the physical or electrical design of the four-device portion of the cell. Thus the RAM performance would be unaffected by the addition of load resistors to the cell. The cell stability and soft error rate with and without load resistors were simulated using conservative assumptions and an analysis methodology that includes transient effects which are important whether dynamic storage or high-resistance loads are used [18]. The cell stability and soft error rate were found to be adequate for several important system applications without the addition of cell load resistors.
`
IV. KEY CIRCUITS
`
The development of new CMOS peripheral circuitry was key to the high-speed access which was a major objective of the 64K CMOS RAM design. Fig. 4 shows a simplified block diagram of the access path, with the delay through each block indicated. In most of the access path, data simply ripple from block to block, with one block activating the next one. Care was taken to achieve a uniform distribution of delay throughout the critical path. Three of the more important new circuits developed for this design will be described: the self-timed array and sense-amplifier circuitry; the row decoder which uses an innovative two-stage NOR and NAND decoder; and the address buffer which uses a nonlinear front end and a self-referencing CMOS latch.

[Figure 4: block diagram; legible block labels: clock, address buffer, bit set, data-out buffer, tri-state driver, data out; legible delay labels: 1.5 ns, 1.9 ns, 5.3 ns, 3.6 ns.]

Fig. 4. Simplified block diagram and delay of the 64K CMOS access path.
`
A. Sense-Amplifier Circuitry
`
The sense amplifier has several unique features including:

a) a sense-amplifier setting waveform that has two distinct slopes for a slow and fast set;
b) a technique for generating the setting signal so it is timed for the accessed word line using p- and n-channel devices; and
c) p-channel decoupling devices between the sense amplifier and the I/O lines for faster setting.
`
The unique features of the sense-amplifier design result in both very high performance and reliable operation over wide parameter variations. This has been confirmed by simulations and actual hardware results.
The array and sense-amplifier circuitry are shown in Fig. 5. During a READ or WRITE operation, a row and a column decoder will be selected. The selected row decoder will cause its associated word line to go high and the selected column decoder will turn on the gates of the n and p complementary parallel bit-switch devices. The use of dual bit switches is necessary to avoid threshold drops in propagating the cell signal from the bit lines to the I/O lines during a READ or in propagating the signal in the reverse direction during a WRITE. Since the bit lines and I/O lines are high at the start of a READ cycle, the p-channel device forms the best path for conducting the signal. When an I/O line is set to a low level during a WRITE, the n-channel bit switch provides the best path for discharging the bit line to a good low level.
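The threshold-drop argument for the dual bit switches can be made concrete with a first-order static model. The sketch below is an illustration only: the function name `pass_level` and the threshold values of 1 V are assumptions, not values from the process, and body effect and transient behavior are ignored.

```python
def pass_level(v_in, vdd=5.0, vtn=1.0, vtp=-1.0, use_n=True, use_p=True):
    """Level reached on the far side of a fully-on bit switch (static, no body effect).

    All voltages in volts. vtn and vtp are assumed, illustrative thresholds.
    """
    candidates = []
    if use_n:
        # n-channel (gate at vdd) passes lows fully but clips highs at vdd - vtn
        candidates.append(min(v_in, vdd - vtn))
    if use_p:
        # p-channel (gate at 0 V) passes highs fully but clips lows at |vtp|
        candidates.append(max(v_in, abs(vtp)))
    if not candidates:
        raise ValueError("at least one device must be enabled")
    # with both devices on, the one that clips less sets the final level
    return min(candidates, key=lambda v: abs(v - v_in))


print(pass_level(5.0, use_p=False))  # n-channel alone loses a threshold on a high
print(pass_level(0.0, use_n=False))  # p-channel alone loses a threshold on a low
print(pass_level(5.0), pass_level(0.0))  # the complementary pair passes both rails
```

Under these assumptions either device alone degrades one of the two logic levels, while the parallel pair passes both, which is the reason given in the text for using complementary bit switches.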
At the end of each word line is the sense-amp set signal generator circuit. As the selected word line rises, it turns on all the memory cells along its length and its set signal generator. A differential voltage builds up across the bit-line pairs as a result of the memory cells turning on. On one of the bit lines the differential voltage propagates through the selected bit switch onto the I/O lines and sense amplifier. As adequate voltage across the sense-amplifier nodes develops, the fast and slow signal from the set signal generator causes the sense amplifier to latch.

Fig. 5. Array and self-timed sense-amplifier circuitry.
The set signal generator of Fig. 5 is connected both to the φSET line and the FS line. Prior to a word line rising, the φSET line is precharged low and the FS line is precharged high. As the word line rises, the output (node A) of the first inverter stage of the set signal generator falls. Node A falling turns on the 10/1 p-channel device 3, which causes φSET to rise in its slow set mode of operation. A short time later the output (node B) of the second stage of the set signal generator will rise, causing n-channel device 6 to turn on, which in turn discharges the FS line to a low level. Device 7 (a large 50/1 p-channel device) connects the FS and φSET lines. When the FS line discharges, device 7 turns on and causes φSET to rise in its fast set mode of operation. The slow and fast set slopes and the delays between them can be adjusted by changing the sizes of the devices in the set signal generator and device 7.
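The two-slope setting waveform can be pictured with a piecewise-linear approximation. The sketch below is only a rough illustration: the function name `phi_set`, the slopes, and the switch-over time are assumed values chosen for readability, not simulated or measured data, and the real waveform changes slope smoothly.

```python
def phi_set(t, t_slow=0.0, t_fast=1.0, slow_slope=0.5, fast_slope=4.0, vdd=5.0):
    """Piecewise-linear set signal (volts) at time t (ns): slow ramp, then fast ramp.

    t_slow:  time at which the small p-channel device starts the slow set (assumed)
    t_fast:  time at which the FS line has discharged and the large device turns on (assumed)
    """
    if t <= t_slow:
        return 0.0  # precharged low before the word line rises
    if t <= t_fast:
        return min(vdd, slow_slope * (t - t_slow))  # slow set mode
    v_at_switch = slow_slope * (t_fast - t_slow)
    return min(vdd, v_at_switch + fast_slope * (t - t_fast))  # fast set mode


for t in (0.0, 0.5, 1.0, 1.5, 2.0, 3.0):
    print(t, phi_set(t))
```

In such a model the slow slope and the delay before the fast slope correspond to the quantities that the text says can be adjusted through device sizing.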
In addition to the slow and fast set signal and self-timing from the accessed word line, high-speed operation is further improved by the small p-channel decoupling devices between the small capacitance nodes of the sense amplifier (SA and SAN) and the high capacitance I/O lines. These decoupling devices make it possible to set the sense amplifier much faster for the same differential signal compared to a sense amplifier without decoupling devices. The SA and SAN nodes are directly connected to the data-out buffer for further amplification before the signal is driven off-chip.
Simulated sense-amplifier waveforms are given in Fig. 6. The two distinct slopes of the φSET signal can be clearly seen. The smooth transition in slope from slow to fast occurs in conjunction with the increased differential voltage build-up across the sense-amplifier nodes. It can also be seen that the small p-channel decoupling devices make it possible to set the sense amplifier as φSET rises without having to discharge the large bit-line or I/O line capacitances.
Extensive simulations have demonstrated that the design of the set signal generator results in reliable performance even if there are substantial parameter variations. Since the set signal is generated from the selected word line, sensitivity to timing skews is limited to the path through the set signal generator to the sense amp relative to the path through the array to the sense amp. Within these sensitive paths, timing variations due to parameter variations are limited by a number of compensating factors. The use of both p and n devices in the set generator and in the array signal path (n cell access device, p bit switch, p decoupling device) tends to compensate for shifts in p thresholds relative to n thresholds. The use of a double inversion in the set signal generator tends to compensate for shifts in the supply voltages. The relatively small device count in the set generator helps to contain sensitivity to errors between devices of the same type on the same chip. Errors due to on-chip variations in capacitances can be compensated by designing the FS line and the φSET line to have capacitance components similar to those of the bit lines.

Fig. 6. Simulated sense-amplifier waveforms assuming the use of a second level of metal to stitch the word line.

Fig. 7. SEM of array and sense-amplifier circuitry.

Fig. 8. Row decoder with two stages of decoding.
An SEM of the array and sense-amplifier circuitry is shown in Fig. 7. The set signal generators are at the end of the word lines. As can be seen, the φSET line and FS line run the entire length of the array. The sense-amplifier layout is symmetrical and balanced. This layout was generated with the layout-rule-independent physical design tool, and the symmetry was retained as layout rules were changed.
`
B. Row Decoder
`
The CMOS row decoder of Fig. 8 is a key block in the access path. It is very fast while also minimizing voltage overshoots and undershoots on the internal nodes of the decoder. Minimization of voltage overshoots and undershoots was a critical factor in the choice of circuits during the design of the 64K CMOS RAM. Conventional CMOS decoder circuits with series connected devices can have internal nodes that may be capacitively coupled well below ground or above the power supply voltage. With this stacked device type of circuit, adjustment of physical design and device sizes to damp the capacitive coupling may result in increased delay for decoder selection. In the decoder circuitry in this design, devices stacked more than two deep were not used. Also, stacking large numbers of devices was avoided elsewhere in the chip. As a consequence of this and other factors, voltage overshoots and undershoots, which could cause charge injection into the substrate and possibly trigger latch-up, were kept to under 0.25 V on all internal nodes of the chip.
Fig. 9. Simplified row decoder.

The row decoder circuit has two stages of decoding. The first stage is a NOR decoder with the true or complement of the higher order address bits as inputs. The second stage is a two-input NAND decoder with the output of the NOR as one of its inputs and either the true or complement of the least significant address bit (LSB) as the other input. In the 64K CMOS design, since the decoders are in the center of the chip, a single NOR decoder can drive four word lines. The simplified row decoder circuit of Fig. 9 shows only a single word line to facilitate description of the circuit operation.
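The two-stage decode can be expressed as a small logic model. The Python sketch below is a functional illustration only: the function names `word_line` and `decode`, the 3-bit address width, and the wiring convention are assumptions made for the example, and no timing or precharge behavior is modeled.

```python
def word_line(nor_inputs, lsb_line):
    """Two-stage decode: NOR of higher-order address lines, then NAND with the LSB line.

    nor_inputs: true-or-complement address lines wired to this NOR (0 or 1 each).
    lsb_line:   true-or-complement LSB line wired to this NAND (0 or 1).
    """
    node_a = 0 if any(nor_inputs) else 1        # NOR stage output
    node_b = 0 if (node_a and lsb_line) else 1  # two-input NAND stage output
    return 1 - node_b                           # word line is high when node B is low


def decode(address, n_bits=3):
    """Word-line vector for an n_bits row address (hypothetical wiring convention)."""
    lines = []
    for row in range(2 ** n_bits):
        nor_inputs, lsb = [], None
        for bit in range(n_bits):
            a = (address >> bit) & 1
            want = (row >> bit) & 1
            if bit == 0:
                lsb = a if want else 1 - a  # LSB line that is high on a match
            else:
                # higher-order line that stays low on a match
                nor_inputs.append(1 - a if want else a)
        lines.append(word_line(nor_inputs, lsb))
    return lines


print(decode(5))  # exactly one word line is selected
```

With all address lines low, as in standby, `word_line` returns 0 for every row because the LSB line is low, which matches the description of the unselected state.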
In standby, all the address lines are low and the output (node A) of the NOR decoder is held to a high voltage through device 3. The word line is in the unselected low state, since the LSB is low, causing the NAND output (node B) to be high. The initiation of an access causes the precharge device 3 to be turned off and the address buffers to drive high either the true or complement of the address inputs to the decoder circuits. A word line is selected only if all the higher order address inputs to its NOR stage remain low and the LSB input to its NAND stage goes high. This results in the NOR decoder output (node A) staying high, the NAND output (node B) going low, and thus the word line going high. Following the selection of a word line and the setting of the sense amps, the selected NOR is discharged due to circuitry not shown on Fig. 8, thereby causing the selected word line to fall. The result is a well-controlled word-line pulsewidth, independent of the cycle time that occurs in an actual application. At the end of an access, all address lines are returned to a low state and the precharge device 3 is turned on. Consequently, the dynamic storage time on the NOR decoder node is small and well controlled.
The word line will remain in the unselected state during an access if the associated LSB remains low or if any of the higher order address inputs go high, causing node A to go low. Unselected word lines, and all word lines during standby, are actively held to ground. However, a momentary bounce on an unselected word line could occur if the NOR output (node A) did not discharge to a low level before the rising LSB turned on device 6 in the NAND stage. Any possibility of an unselected word-line bounce is eliminated by providing two stages of delay of the LSB rising relative to the higher order address bits rising, as shown in Fig. 10. Address line skew is contained by careful physical placement of the address buffers and address lines, by use of identical layouts for all address buffers, and through design of the address buffer circuit (see Section IV-C). Even with the very conservative bounce protection delay of 0.8 ns, the row decoder is still very fast, with a nominal delay from the higher order address bits rising to the word line rising of only 2.6 ns.
`
Fig. 10. Delay of least significant address bit from higher order address bits.

Fig. 11. Simplified input circuit and nonlinear voltage characteristic.
`
C. Input Circuit
`
The circuit for input of TTL addresses and data has high speed, low power dissipation, and safe operation. It is shown in a simplified version in Fig. 11, with the complete schematic shown in Fig. 12. Activated by the clock input falling, the circuit converts TTL levels to CMOS on-chip drive, latches the input state, and then disconnects the external input from the internal circuitry during an access. Following an access and the rise of the clock input, the circuit is designed to quickly precharge the internal nodes and the address lines for cycle time minimization. The power dissipation and delay skew as a function of TTL variations and device parameter variations are well contained by this circuit design, which also provides very high speed. The delay through the circuit from the rise of the clock input until the rise of the large capacitance address lines is only 1.9 ns. As will be described in this section, CMOS devices are key to the high-speed safe operation of this circuit, especially as used in the two distinctive portions in Fig. 11: the nonlinear front end and the self-referencing latch.
Fig. 12. Input circuit with nonlinear front end.

A salient feature of the high-speed input circuit is the nonlinear front end, which gives the voltage characteristic shown in Fig. 11. Because of the body-effected threshold voltage of p-channel device 2, a solid ground is provided at node B over the full range of low-input TTL signal levels. This can be seen in the voltage characteristic where the voltage at node B versus the voltage at node A is plotted. At the beginning of an access before the clock falls, for input voltages less than 1.8 V the input device is cut off and devices 9 and 10, shown on Fig. 12, hold node B to ground. Very small devices can be used in the inverter that drives device 10, so that power dissipation due to intermediate voltages on node A (input to the inverter) is small. If node B is at ground as the clock falls, the latch sets with node D high and with no steady-state power dissipation. If the input voltage prior to latch activation is greater than 1.8 V, the small capacitance on node B will be quickly charged high through the input devices and the latch will set with node C high, causing input device 2 to be turned off and device 8 to be turned on. This will result in a good high being provided on node B, thereby cutting off any momentary power dissipation in the latch.
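The decision made by the nonlinear front end and the latch can be summarized in an idealized form. The Python sketch below is an abstraction, not a circuit simulation: the function names are hypothetical, the 1.8-V cutoff and 0.8-V worst-case TTL low are the values quoted in the text, the 2.0-V minimum TTL high is the standard TTL input specification, and the behavior at exactly 1.8 V is chosen arbitrarily since the text does not define it.

```python
CUTOFF_V = 1.8        # front-end cutoff quoted in the text
TTL_LOW_MAX_V = 0.8   # worst-case TTL low quoted in the text
TTL_HIGH_MIN_V = 2.0  # standard minimum TTL input high level


def node_b_is_grounded(v_in, cutoff=CUTOFF_V):
    """Idealized front end: node B is held at ground for inputs below the cutoff."""
    return v_in < cutoff  # behavior at exactly the cutoff is an arbitrary choice here


def latch_node_high(v_in):
    """Latch node that goes high when the clock falls: 'D' for a low input, 'C' for a high."""
    return "D" if node_b_is_grounded(v_in) else "C"


for v in (0.0, TTL_LOW_MAX_V, TTL_HIGH_MIN_V, 5.0):
    print(v, latch_node_high(v))
```

The point of the abstraction is that every valid TTL low, including the worst case of 0.8 V, presents the same grounded level to the latch, so the latch sees only two well-defined input conditions.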
As shown on the complete schematic of the input circuit in Fig. 12, the other circuitry driving the devices connected to node B served to limit the overshoot and undershoot of nodes to ≤ 0.25 V and isolate the latch during an access, so the address inputs can be set up for the next access. As shown in Fig. 12, the gate of device 2 is switchable, being controlled by the levels of ADR and its complement ADR-bar. At the start of a cycle both ADR and ADR-bar are low and the gate of device 2 is held at ground. Once the input circuit is activated and the address output becomes valid (ADR or ADR-bar goes high), the voltage level on the gate of device 2 goes high, turning device 2 off and disconnecting the external input until the latch is reset by the clock rising at the end of an access.
The self-referencing CMOS latch in the input circuit is key to providing high speed, low power, safe operation. Referring to Fig. 11, a balanced physical design is used, so that p-channel devices 5 and 6 are well matched to each other, as are the n-channel latch devices 3 and 4. At the beginning of an access, the clock input is high and nodes C and D are low, resulting in both devices 3 and 4 being cut off. Shortly after the beginning of an access, the clock input will fall, turning on p-channel device 7 and charging nodes C and D until the n-channel latch sets in the direction determined by which of the p-channel steering devices 5 and 6 is most conductive. If node B has been charged high through the input devices 1 and 2, p-channel device 5 will be much more conductive than p-channel device 6, steering the setting of the n-channel latch so that node C goes high with negligible variation in delay as a function of variation in the TTL high level. If node B is low, then initially devices 5 and 6 will be equally conductive. However, as nodes C and D both begin to charge high, device 5 will quickly become less conductive than device 6, thereby steering the setting of the latch so that node D goes high. If the nonlinear front end were not used to provide a good ground at node B for any external input voltage less than 1.8 V, then the worst-case TTL low of 0.8 V would result in substantially longer delay through the input circuit relative to the delay for a good TTL high. However, the self-referencing CMOS latch used in conjunction with the nonlinear front end results in good control of delay and power dissipation variations for the full range of TTL input levels, while also providing high-speed operation.
`
V. RESULTS
`
The chip has been extensively tested using N² test patterns. Functionality has been measured on all 16 data inputs and outputs for a power supply voltage of 3–6 V. The chip is also operational with a ±10-percent power supply variation over the full range of TTL input levels. Waveforms showing a CS access time of under 15 ns are shown in Fig. 13. The delays for each block in the critical path are given on the block diagram of Fig. 4. Simulations assuming a second level of metal to stitch the word line, 1.0-μm effective channel lengths for the n and p devices, and some minor redesign give a nominal access time of 10 ns.
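The text does not say which N² patterns were applied. As an illustration of what a test of order N² looks like, the sketch below implements the well-known galloping pattern (GALPAT) on a small software memory; the function name `galpat` and the list-backed memory are assumptions for the example and are not a description of the authors' test program.

```python
def galpat(memory_size, read, write):
    """Galloping-pattern test on memory_size cells; returns the number of reads.

    read(addr) and write(addr, value) are caller-supplied accessors.
    Raises AssertionError if a cell does not hold the expected value.
    """
    reads = 0
    for background in (0, 1):
        for cell in range(memory_size):
            write(cell, background)          # fill with the background value
        for base in range(memory_size):
            write(base, 1 - background)      # complement the base cell
            for other in range(memory_size):
                if other == base:
                    continue
                assert read(other) == background, f"cell {other} disturbed"
                assert read(base) == 1 - background, f"base cell {base} lost"
                reads += 2
            write(base, background)          # restore the base cell
    return reads


mem = [0] * 8
print(galpat(8, mem.__getitem__, mem.__setitem__))  # read count grows as 4N(N-1)
```

The read count of 4N(N-1) shows why such patterns are reserved for extensive characterization: the test time grows with the square of the number of cells.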
`
VI. PHYSICAL DESIGN
`
The use of a ground-rule-independent graphics tool for the high-performance 64K CMOS RAM artwork design is unique. The graphics tool enables timely accommodation of physical layout rule changes during the design cycle. The layout rules are contained in a file which can be changed or updated. Physical layout rule violations are virtually eliminated by the use of this graphics tool. The algorithm used in the approach has been described elsewhere [19].

Fig. 13. CS access time waveform.

Fig. 14. SEM of row address buffer with nonlinear front end.

Fig. 15. (a) Plot of address buffer of Fig. 14 which has a minimum effective channel length of 1.1 μm. (b) Plot of same address buffer with the effective channel length being reduced to 0.7 μm along with several other ground rule changes.
The artwork for this chip was derived from existing artwork for a chip previously designed using the ground-rule-independent tool. To make this conversion, ~90 percent of the ground rules changed. Roughly six man weeks of time were needed for the conversion. Since the tool was then in an early stage of development, it is felt that this time could be reduced by possibly an order of magnitude. In addition, the tool has been used to generate several versions of the chip for various process development vehicles. The version of the chip described here used the preexisting nonoptimized pad cage used by all versions of the design.
Fig. 16. SEM of the 64K RAM.

An example will illustrate the potential of the ground-rule-independent layout tool. The SEM of Fig. 14 shows the previously described input circuit used as a row address buffer. The same address buffer shown in Fig. 15(a) has an effective channel length of 1.1 μm and an area of 19321 μm². In Fig. 15(b) several ground rules were changed including a change in effective channel length to 0.7 μm. For the same device width-to-length ratios the area reduces to 13689 μm². An examination of the aspect ratio of the two plots clearly reveals a more complicated transformation than a simple scaling. The total real time to generate the new graphics data was about 30 s.
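The area figures quoted for the two address buffer layouts can be compared with what a uniform shrink would give. The sketch below is a simple arithmetic check; the function name `area_ratios` is hypothetical, and treating the effective channel length ratio as the linear shrink factor is an assumption made only for comparison, since effective channel length is not a drawn dimension.

```python
def area_ratios(area_before, area_after, l_before, l_after):
    """Return (measured area ratio, area ratio expected from a uniform linear shrink)."""
    measured = area_after / area_before
    uniform = (l_after / l_before) ** 2  # area scales as the square of a linear factor
    return measured, uniform


# Areas in square micrometers and channel lengths in micrometers, as quoted in the text.
measured, uniform = area_ratios(19321.0, 13689.0, 1.1, 0.7)
print(f"measured area ratio:       {measured:.3f}")
print(f"uniform-shrink area ratio: {uniform:.3f}")
```

The measured area ratio is roughly 0.71, while a uniform shrink by the channel length ratio would give roughly 0.40. The difference is consistent with the statement that the transformation is more complicated than a simple scaling.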
An SEM of the 64K CMOS RAM chip is shown in Fig. 16. The chip has a single level of metal. Physically the chip is divided into four 16K quadrants with the row and column decoders in the center. Even with a silicided word line that runs only halfway across the array and a split word-line cell [5], the RC delay for a signal propagating down the word line is approximately 3 ns. Each 16K quadrant has four sense amplifiers and data-in buffers located on its periphery.
`
VII. SUMMARY
`
A 64K CMOS RAM with an access time of 15 ns has been described. The RAM was built using a single level of metal, an average minimum feature size of 1.35 μm, and an effective channel length of 1.1 and 1.2 μm for n- and p-channel devices, respectively. An access time of 10 ns is possible with the word line stitched on a second level of metal, an effective channel length of 1.0 μm, and some minor redesign. High speed has been achieved through innovative circuits and design concepts. A layout-rule-independent graphics tool was used for the artwork design.
`
`
`ACKNOWLEDGMENT
`
The support and direction provided by L. Terman, K. Beilstein, and F. Weidman, and the contribution of L. Terman and R. V. Rajeevakumar to the decoder circuit are appreciated. The authors are also indebted to the Yorktown Research Silicon Facility for CMOS processing of the chips.
`
REFERENCES

[1] T. Ohzone et al., "A 64Kb static RAM," in ISSCC Dig. Tech. Papers, Feb. 1980, pp. 236-237.
[2] A. V. Ebel et al., "An NMOS 64K static RAM," in ISSCC Dig. Tech. Papers, Feb. 1982, pp. 254-255.
[3] K. Tanimoto, "A 64K X 1 bit NMOS static RAM," in ISSCC Dig. Tech. Papers, Feb. 1983, pp. 66-67.
[4] M. Isobe et al., "A 46ns 256K CMOS RAM," in ISSCC Dig. Tech. Papers, Feb. 1984, pp. 214-215.
[5] S. Schuster et al., "A 20ns 64K NMOS RAM," in ISSCC Dig. Tech. Papers, Feb. 1984, pp. 226-227.
[6] O. Minato et al., "A 20ns 64K CMOS SRAM," in ISSCC Dig. Tech. Papers, Feb. 1984, pp. 222-223.
[7] S. Yamamoto et al., "A 256K CMOS SRAM with variable-impedance loads," in ISSCC Dig. Tech. Papers, Feb. 1985, pp. 58-59.
[8] H. Shinohara et al., "A 45ns 256K CMOS SRAM with tri-level word line," in ISSCC Dig. Tech. Papers, Feb. 1985, pp. 62-63.
[9] K. Ochii et al., "A 17ns CMOS RAM with Schmitt trigger sense amplifier," in ISSCC Dig. Tech. Papers, Feb. 1985, pp. 64-65.
[10] S. E. Schuster et al., "An 11ns 64K (4K X 16) NMOS RAM," in Int. Symp. VLSI Technol., Systems and Applications, Proc. Tech. Papers, May 1985, pp. 24-28.
[11] N. Okazaki et al., "A 30ns 256K full CMOS SRAM," in ISSCC Dig. Tech. Papers, Feb. 1986, pp. 204-205.
[12] K. Ichinose et al., "25ns 256K X 1/64K X 4 CMOS SRAM's," in ISSCC Dig. Tech. Papers, Feb. 1986, pp. 248-249.
[13] M. Honda et al., "A 25ns 256K CMOS RAM," in ISSCC Dig. Tech. Papers, Feb. 1986, pp. 250-251.
[14] S. E. Schuster et al., "A 15 ns CMOS 64K RAM," in ISSCC Dig. Tech. Papers, Feb. 1986, pp. 206-207.
[15] S. T. Flannagan et al., "Two 64K CMOS SRAM's with 13ns access time," in ISSCC Dig. Tech. Papers, Feb. 1986, pp. 208-209.
[16] K. Ogiue et al., "A 13ns/500mW 64Kb ECL RAM," in ISSCC Dig. Tech. Papers, Feb. 1986, pp. 212-213.
[17] F. S. Lai et al., "A highly latchup-immune 1 μm CMOS technology fabricated with 1 MeV ion implantation and self-aligned TiSi2," in IEDM Dig. Tech. Papers, Dec. 1985, pp. 513-516.
[18] B. A. Chappell et al., "Stability and SER analysis of static RAM cells," IEEE Trans. Electron Devices, vol. ED-32, no. 2, pp. 463-470, Feb. 1985.
[19] P. W. Cook, "Modified relaxation algorithm for mixed constraints," IBM J. Res. Develop., vol. 28, no. 5, pp. 581-589, Sept. 1984.

Robert L. Franch received the B.S.E.E. degree from the Polytechnic Institute of New York, Brooklyn, in 1980.
In 1980 he joined IBM, East Fishkill, NY, in a Bipolar Device Reliability Group, where he worked on accelerated life testing of bipolar memory, logic, and test vehicle chips. In 1984, he joined IBM Research in Yorktown Heights, NY, as a Member of the Test Systems Group. He has since been engaged in the functional testing of NMOS, CMOS, and bipolar memory and logic chips developed at IBM Research.
`
Stanley E. Schuster (S'61-M'65), for photograph and biography please see this issue, p. 604.
`
Barbara A. Chappell (M'85) received the B.S.E.E. degree from the University of Portland, Portland, OR, in 1977 and the M.S.E.E. degree from the University of California at Berkeley in 1981.
In 1978 she joined IBM at the T. J. Watson Research Center, Yorktown Heights, NY, where she is currently a Research Staff Member, working primarily in the field of MOS circuit design. Her previous employment included ten years with the Custom IC Department at Tektronix, Beaverton, OR.
`
Paul F. Greier received the B.S. degree in mathematics from Mercy College in 1980 and the M.S. degree in computer science from the Polytechnic Institute of New York, Brooklyn, in 1982.
He joined IBM at the Thomas J. Watson Research Center, Yorktown Heights, NY, in 1965, and was engaged in digital interface logic design and laboratory automation. He moved to systems programming, developing bi-synch host communications software for distributed data-acquisition systems. He has been involved in automated testing since developing test software for Josephson devices in 1982-1983, and has been responsible for the functional testing of VLSI memories and logic in the Semiconductor and Science Technology Department since 1984.
`
Stephen P. Klepner was born in New York, NY, on April 15, 1942. He received the B.S.E.P. and Ph.D. degrees from New York Univer