`
`Vervain Ex. 2003, p. 1
`Micron v. Vervain
`IPR2021-01549
`
`
`
`Inside NAND Flash Memories
`
`WDC_V_0002919
`
`
`
`
`Rino Micheloni • Luca Crippa • Alessia Marelli
`
`
`
`
`
`Rino Micheloni
`Integrated Device Technology
`Agrate Brianza
`Italy
`rino.micheloni@ieee.org
`
`Luca Crippa
`Forward Insights
`North York
`Canada
`luca.crippa@ieee.org
`
`Alessia Marelli
`Integrated Device Technology
`Agrate Brianza
`Italy
`alessiamarelli@gmail.com
`
`ISBN 978-90-481-9430-8
`DOI 10.1007/978-90-481-9431-5
`Springer Dordrecht Heidelberg London New York
`
`e-ISBN 978-90-481-9431-5
`
`Library of Congress Control Number: 2010931597
`
`© Springer Science+Business Media B.V. 2010
`No part of this work may be reproduced, stored in a retrieval system, or transmitted in any form or by
`any means, electronic, mechanical, photocopying, microfilming, recording or otherwise, without written
`permission from the Publisher, with the exception of any material supplied specifically for the purpose of
`being entered and executed on a computer system, for exclusive use by the purchaser of the work.
`
Cover design: eStudio Calamar S.L.
`
`Printed on acid-free paper
`
`Springer is part of Springer Science+Business Media (www.springer.com)
`
`
`
`
`2.2 NAND memory 21
`
[Figure labels, NAND string: Bitline (BL), D, MDL, DSL, WL<63> ... WL<0> (Wordlines), MSL, SSL, S, Source Line (SL).
NAND array: Block 0 (DSL0, WL0<63:0>, SSL0) and Block 1 (DSL1, WL1<63:0>, SSL1), NAND strings on BL even and BL odd, shared Source Line (SL)]
Fig. 2.2. NAND string (left) and NAND array (right)
`
`
`
Logical pages are made up of cells belonging to the same wordline. The
number of pages per wordline depends on the storage capability of the memory
cell. Flash memories are named according to the number of storage levels:
SLC memories store 1 bit per cell, MLC memories (Chap. 10) store 2 bits per cell,
8LC memories (Chap. 16) store 3 bits per cell, and 16LC memories (Chap. 16)
store 4 bits per cell.
If we consider the SLC case with interleaved architecture (Chap. 8), even and
odd cells form two different pages. For example, an SLC device with a 4 kB page
has a wordline of 65,536 cells: each page holds 4,096 × 8 = 32,768 bits, and the
wordline holds two such pages.
In the MLC case there are four pages per wordline, as each cell stores one Least
Significant Bit (LSB) and one Most Significant Bit (MSB). Therefore, we have:
− MSB and LSB pages on even bitlines
− MSB and LSB pages on odd bitlines
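
The page and cell counts above follow from simple arithmetic, which can be sketched as below. This is only an illustration: the function names are hypothetical, and 4 kB is taken to mean 4,096 bytes.

```python
def cells_per_wordline(page_bytes, interleaved=True):
    """Cells on one wordline for a given page size.

    A logical page uses one bit in every cell it spans, so a page of
    page_bytes needs page_bytes * 8 cells. With the interleaved
    architecture, even and odd bitlines hold separate pages, which
    doubles the number of cells on the wordline.
    """
    cells_per_page = page_bytes * 8
    return cells_per_page * (2 if interleaved else 1)


def pages_per_wordline(bits_per_cell, interleaved=True):
    """Logical pages on one wordline: one per stored bit, per bitline group."""
    return bits_per_cell * (2 if interleaved else 1)
```

With these definitions, `cells_per_wordline(4096)` reproduces the 65,536 cells of the SLC example, and `pages_per_wordline(2)` gives the four MLC pages listed above.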
`
`
`
`
`
`22 2 NAND overview: from memory to systems
`
All the NAND strings sharing the same group of wordlines are erased together,
thus forming a Flash block. In Fig. 2.2 two blocks are shown: using a bus
representation, one block is made up of WL0<63:0> while the other one includes
WL1<63:0>.
A NAND Flash device is mainly composed of the memory array. However,
additional circuits are needed in order to perform read, program, and erase
operations. Since the NAND die must fit in a package with a well-defined size,
it is important to organize all the circuits and the array in the early design
phase, i.e. to define a floorplan.
In Fig. 2.3 an example of a floorplan is given. The Memory Array can be split
into different planes (two planes in Fig. 2.3). A Wordline is highlighted in the
horizontal direction, while a Bitline is shown in the vertical direction.
The Row Decoder is located between the planes: this circuit has the task of
properly biasing all the wordlines belonging to the selected NAND string (Sect.
2.2.2). All the bitlines are connected to sense amplifiers (Sense Amp). There can
be one or more bitlines per sense amplifier; for details, refer to Chap. 8.
The purpose of the sense amplifiers is to convert the current sunk by the memory
cell into a digital value. In the peripheral area there are charge pumps and voltage
regulators (Chap. 11), logic circuits (Chap. 6), and redundancy structures (Chap.
13). PADs are used to communicate with the external world.
`
`
[Figure labels: Memory Array (Plane 0), Memory Array (Plane 1), Row Decoder, Sense Amp, Peripheral Circuits, Wordline, Bitline, PAD]
Fig. 2.3. NAND Flash memory floorplan
`
`2.2.2 Basic operations
`
`This section briefly describes the basic NAND functionalities: read, program, and
`erase.
`
`
Read
When we read a cell (Fig. 2.4), its gate is driven at VREAD (0 V), while the other
cells are biased at VPASS,R (usually 4–5 V), so that they act as pass-transistors
regardless of the value of their threshold voltages. In fact, an erased Flash cell
has a VTH smaller than 0 V; vice versa, a written cell has a positive VTH which is,
however, smaller than 4 V. In practice, when the gate of the selected cell is biased
at 0 V, the series of all the cells conducts current only if the addressed cell is
erased.
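
The read condition can be illustrated with a toy model in which each cell is an ideal switch that turns on when its gate voltage exceeds its threshold voltage. This is a sketch under stated assumptions, not a circuit model: the function names are hypothetical, VPASS,R is set to an assumed 4.5 V inside the quoted range, and mapping an erased cell to logic 1 is the usual NAND convention rather than something stated in the text.

```python
V_READ = 0.0     # gate voltage of the selected cell (value given in the text)
V_PASS_R = 4.5   # assumption: mid-point of the 4-5 V range given in the text


def string_conducts(vth_cells, selected):
    """Return True if the NAND string sinks current during a read.

    vth_cells lists the threshold voltage of every cell in the string.
    The string conducts only if every cell in the series is on.
    """
    for index, vth in enumerate(vth_cells):
        gate = V_READ if index == selected else V_PASS_R
        if gate <= vth:
            return False
    return True


def read_slc_bit(vth_cells, selected):
    """Assumed convention: a conducting (erased) cell reads 1, a written cell 0."""
    return 1 if string_conducts(vth_cells, selected) else 0
```

Because every written cell has a VTH below VPASS,R, only the selected cell can block the string, which is why the string current depends on the addressed cell alone.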
`
`
`
`
[Figure inset labels: V1, V2, VDD, MP, MN, CBL, VBL, VOUT, ICELL]
Fig. 2.4. NAND string biasing during read and SLC VTH distributions
`
`
`
String current is usually in the range of 100–200 nA. The read technique is
based on charge integration, exploiting the bitline parasitic capacitor. This
capacitor is precharged at a fixed value (usually 1–1.2 V): the capacitor is
discharged only if the cell is erased and sinks current. Several circuits exist to
detect the bitline parasitic capacitor state: the structure depicted in the inset of
Fig. 2.4 is present in almost all solutions. The bitline parasitic capacitor is
indicated with CBL while the NAND string is equivalent to a current generator.
During the charge of the bitline, the gate of the PMOS transistor MP is kept
grounded, while the gate of the NMOS transistor MN is kept at a fixed value V1.
A typical value for V1 is around 2 V. At the end of the charge transient the bitline
will have a voltage VBL:
`
`
`
`
$$V_{BL} = V_1 - V_{THN} \qquad (2.1)$$
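
Eq. (2.1) and the charge-integration read can be put into numbers with a small sketch. It assumes that VTHN is the threshold voltage of MN. Only V1 (around 2 V), the precharge level (1 to 1.2 V) and the string current (100 to 200 nA) come from the text; the bitline capacitance and the sensed voltage drop used in the note below are illustrative assumptions, and the function names are hypothetical.

```python
def bitline_precharge_voltage(v1, vth_n):
    """Eq. (2.1): VBL = V1 - VTHN, with all voltages in volts."""
    return v1 - vth_n


def discharge_time(c_bl, delta_v, i_cell):
    """Time in seconds for a constant current to discharge a capacitor.

    Charge integration gives t = C * dV / I, treating the NAND string as
    an ideal current generator of i_cell amperes.
    """
    return c_bl * delta_v / i_cell
```

For example, with V1 = 2 V and an assumed VTHN of 0.9 V the bitline settles near 1.1 V, inside the quoted precharge range. With an assumed CBL of 2 pF and a 0.3 V drop to be sensed, a 150 nA string current needs about 4 µs, which suggests why the evaluation phase weighs on read time.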
`
`
`
`
`
`
`2.4 NAND-based systems 39
`
All these issues cause a severe limitation to the maximum capacity of the card;
in addition, external components such as voltage regulators and quartz cannot be
used. In other words, the memory controller of the card has to implement all the
required functions.
The assembly stress for small form factors is quite high and, therefore, system
testing is performed at the end of production. Hence, production cost is higher
(Chap. 15).
`
`
`
`
[Figure labels: Host, Host Interface, Microcontroller, Passives, array of Flash devices]
Fig. 2.32. Block diagram of an SSD

[Figure labels: HOST (User Application, Operating System, Low Level Drivers, Flash Card I/F (SD, MMC, CF, ...) or SSD I/F (SATA, PCIe, ...)); Flash Card - SSD with MEMORY CONTROLLER (HOST Interface (SD, MMC, CF, SATA, PCIe, ...); FFS (FW): Wear Leveling (dynamic, static), Garbage Collection, Bad Block Management; ECC; Flash Interface (I/F)); Flash channels 0, 1, ..., N connected to NAND devices]
Fig. 2.33. Functional representation of a Flash card (or SSD)
`
`
For a more detailed description of Flash cards, refer to Chap. 17. SSDs
are described in Chap. 18.
Figure 2.33 shows a functional representation of a memory card or SSD, in which
two types of components can be identified: the memory controller and the Flash
memory components. Actual implementations may vary, but the functions described
in the next sections are always present.
`
`2.4.1 Memory controller
`
`The aim of the memory controller is twofold:
`1. To provide the most suitable interface and protocol towards both the host and
`the Flash memories
`2. To efficiently handle data, maximizing transfer speed, data integrity and
`information retention
`In order to carry out such tasks, an application specific device is designed,
`embedding a standard processor – usually 8–16 bits – together with dedicated
`hardware to handle timing-critical tasks.
`For the sake of discussion, the memory controller can be divided into four
`parts, which are implemented either in hardware or in firmware. Proceeding from
`the host to the Flash, the first part is the host interface, which implements the
`required industry-standard protocol (MMC, SD, CF, etc.), thus ensuring both
`logical and electrical interoperability between Flash cards and hosts. This block is
`a mix of hardware – buffers, drivers, etc. – and firmware – command decoding
`performed by the embedded processor – which decodes the command sequence
`invoked by the host and handles the data flow to/from the Flash memories.
The second part is the Flash File System (FFS) [6]: that is, the file system
which enables the use of Flash cards, SSDs and USB sticks like magnetic disks.
For instance, sequential memory access on a multitude of sub-sectors which
constitute a file is organized by linked lists (stored on the Flash card itself) which
are used by the host to build the File Allocation Table (FAT).
The FFS is usually implemented in the form of firmware inside the controller,
each sub-layer performing a specific function. The main functions are: Wear
Leveling Management, Garbage Collection and Bad Block Management. For all these
functions, tables are widely used in order to map sectors and pages from logical to
physical (Flash Translation Layer or FTL) [7, 8], as shown in Fig. 2.34.
The upper block row is the logical view of the memory, while the lower row is
the physical one. From the host perspective, data are transparently written and
overwritten inside a given logical sector: due to Flash limitations, overwriting
the same page is not possible, therefore a new page (sector) must be allocated in
the physical block and the previous one is marked as invalid. At some point in
time the current physical block becomes full, and therefore a second one (Buffer)
is assigned to the same logical block.
The required translation tables are always stored on the memory card itself,
thus reducing the overall card capacity.
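
The out-of-place update described above can be sketched with a toy translation table. This is a minimal illustration under stated assumptions, not the FTL of any real controller: the class name, its fields and the in-memory table (which a real card would store in Flash) are all hypothetical.

```python
class LogicalBlock:
    """One logical block backed by a physical block and, once full, a buffer block."""

    def __init__(self, pages_per_block):
        self.pages_per_block = pages_per_block
        self.physical = []    # pages of the current physical block, in write order
        self.buffer = []      # pages of the buffer block, used when physical is full
        self.mapping = {}     # logical sector -> (block name, page index)
        self.invalid = set()  # (block name, page index) pairs holding stale data

    def write(self, sector, data):
        """Write a sector to a new page and mark its previous copy as invalid."""
        if len(self.physical) < self.pages_per_block:
            name, block = "physical", self.physical
        elif len(self.buffer) < self.pages_per_block:
            name, block = "buffer", self.buffer
        else:
            raise RuntimeError("both blocks are full: garbage collection is needed")
        if sector in self.mapping:
            self.invalid.add(self.mapping[sector])
        block.append(data)
        self.mapping[sector] = (name, len(block) - 1)

    def read(self, sector):
        """Follow the translation table to the latest copy of the sector."""
        name, page = self.mapping[sector]
        return (self.physical if name == "physical" else self.buffer)[page]
```

From the host side, repeated calls to `write` on the same sector look like overwrites, while the table silently moves the sector to fresh pages and, once the physical block is full, into the buffer block.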
`
`
[Figure labels: Logical Block, Physical Block, Physical Buffer Block; A = Available]
Fig. 2.34. Logical to physical block management
`
`
`
Wear leveling
Usually, not all the information stored within the same memory changes with the
same frequency: some data are often updated while others remain the same for a
very long time – in the extreme case, for the whole life of the device. It is clear
that the blocks containing frequently updated information are stressed with a
large number of write/erase cycles, while the blocks containing information
updated very rarely are much less stressed.
In order to mitigate disturbs, it is important to keep the aging of each
page/block as low and as uniform as possible: that is, the number of both
read and program cycles applied to each page must be monitored. Furthermore,
the maximum number of allowed program/erase cycles for a block (i.e. its
endurance) should be considered: when SLC NAND memories are used, this
number is on the order of 100 k cycles, which is reduced to 10 k when MLC
NAND memories are used.
Wear Leveling techniques rely on the concept of logical to physical translation:
that is, each time the host application requires updates to the same (logical) sector,
the memory controller dynamically maps the sector onto a different (physical)
sector, keeping track of the mapping either in a specific table or with pointers. The
out-of-date copy of the sector is tagged as both invalid and eligible for erase. In
this way, all the physical sectors are evenly used, thus keeping the aging under a
reasonable value.
Two kinds of approaches are possible: Dynamic Wear Leveling is normally
used to follow up a user's request to update a sector; Static Wear Leveling can
also be implemented, where every sector, even the least modified, is eligible for
re-mapping as soon as its aging deviates from the average value.
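
The two approaches can be contrasted in a short sketch that works at block granularity. It is an illustration only: the class, its method names and the `static_threshold` parameter are hypothetical, and treating "deviates from the average" as "lags the average erase count by more than a threshold" is an assumption.

```python
class WearLeveler:
    """Tracks erase counts and picks physical blocks so that wear stays even."""

    def __init__(self, num_blocks, static_threshold=100):
        self.erase_count = [0] * num_blocks
        self.free = set(range(num_blocks))
        self.static_threshold = static_threshold  # hypothetical tuning parameter

    def allocate(self):
        """Dynamic wear leveling: serve an update from the least-worn free block."""
        block = min(self.free, key=lambda b: self.erase_count[b])
        self.free.remove(block)
        return block

    def release(self, block):
        """Erase an out-of-date block and return it to the free pool."""
        self.erase_count[block] += 1
        self.free.add(block)

    def static_candidates(self):
        """Static wear leveling: in-use blocks whose wear lags the average."""
        average = sum(self.erase_count) / len(self.erase_count)
        return [
            b for b in range(len(self.erase_count))
            if b not in self.free
            and average - self.erase_count[b] > self.static_threshold
        ]
```

Dynamic leveling alone never touches a block that holds rarely updated data, so its erase count stays low while the others climb; `static_candidates` flags such blocks so their content can be moved and the block put back into rotation.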
`
Garbage collection
Both wear leveling techniques rely on the availability of free sectors that can be
filled up with the updates: as soon as the number of free sectors falls below a
given threshold, sectors are “compacted” and multiple, obsolete copies are deleted.
This operation is performed by the Garbage Collection module, which selects the
blocks containing the invalid sectors, copies the latest valid copy into free sectors
and erases such blocks (Fig. 2.35).
In order to minimize the impact on performance, garbage collection can be
performed in the background. The equilibrium generated by wear leveling
distributes wear-out stress over the whole array rather than on single hot spots.
Hence, the bigger the memory density, the lower the wear-out per cell.
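
The select, copy and erase steps of garbage collection can be sketched as follows. The data layout (dictionaries keyed by block id and logical sector) and the function name are hypothetical choices made for illustration, and the sketch assumes the free block has room for all valid copies.

```python
def garbage_collect(blocks, valid, free_block):
    """Compact valid sectors into free_block and erase blocks with invalid copies.

    blocks:     dict block id -> list of (logical sector, data), in write order
    valid:      dict logical sector -> (block id, entry index) of its latest copy
    free_block: id of an erased block that receives the valid copies
    Returns the list of erased block ids.
    """
    # Select every block that holds at least one out-of-date copy.
    victims = [
        block_id for block_id, entries in blocks.items()
        if block_id != free_block
        and any(valid.get(sector) != (block_id, i)
                for i, (sector, _) in enumerate(entries))
    ]
    for block_id in victims:
        # Copy only the latest valid copies, then erase the whole block.
        for i, (sector, data) in enumerate(blocks[block_id]):
            if valid.get(sector) == (block_id, i):
                blocks[free_block].append((sector, data))
                valid[sector] = (free_block, len(blocks[free_block]) - 1)
        blocks[block_id] = []
    return victims
```

Erasing happens per block while validity is tracked per sector, which is why valid sectors must be copied out before the block that contains them can be reclaimed.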
`
`
[Figure labels: lists of logical sectors (Sect<0>, Sect<1>, ..., Sect<100>) stored in Block <n> and in other blocks, unused locations shown as Free, legend Invalid Logic Sector]
Fig. 2.35. Garbage collection
`
`
`
Bad block management
No matter how smart the Wear Leveling algorithm is, an intrinsic limitation of
NAND Flash memories is represented by the presence of so-called Bad Blocks
(BB), i.e. blocks which contain one or more locations whose reliability is not
guaranteed.
The Bad Block Management (BBM) module creates and maintains a map of
bad blocks, as shown in Fig. 2.36: this map is created during factory initialization
of the memory card, and thus contains the list of the bad blocks already present
during the factory testing of the NAND Flash memory modules. It is then updated
during the device lifetime whenever a block becomes bad.
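
A bad block map with a pool of reserved blocks can be sketched as below. This is a hypothetical illustration: the class, its fields and the policy of taking the first spare are assumptions, and a real controller would also have to move the data of the failing block.

```python
class BadBlockManager:
    """Maps logical blocks to good physical blocks, using a reserved spare pool."""

    def __init__(self, num_physical, num_reserved, factory_bad=()):
        self.bad = set(factory_bad)  # map created at factory initialization
        good = [b for b in range(num_physical) if b not in self.bad]
        usable = len(good) - num_reserved
        if usable <= 0:
            raise ValueError("not enough good blocks for the requested reserve")
        self.logical_to_physical = good[:usable]  # list index = logical block
        self.reserved = good[usable:]             # kept for future bad blocks

    def mark_bad(self, physical):
        """Record a block that became bad and remap its logical block to a spare."""
        self.bad.add(physical)
        if physical in self.reserved:
            self.reserved.remove(physical)
            return
        if physical in self.logical_to_physical:
            if not self.reserved:
                raise RuntimeError("no reserved blocks left")
            logical = self.logical_to_physical.index(physical)
            self.logical_to_physical[logical] = self.reserved.pop(0)
```

The host keeps addressing the same logical block throughout; only the entry in `logical_to_physical` changes when a block is retired.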
`
`
[Figure labels: Logical Block, Good Physical Block, Bad Physical Block; R = Reserved for future BB]
Fig. 2.36. Bad Block Management (BBM)
`
`
`
ECC
This task is typically executed by dedicated hardware inside the memory
controller. Examples of memories with embedded ECC are also reported [9–11].
The most popular ECC codes correcting more than one error are Reed–Solomon
and BCH [12]. While encoding takes a few controller cycles of latency, the decoding
phase can take a large number of cycles and visibly reduce read performance as
well as the memory response time at random access.
There are different reasons why the read operation may fail (with a certain
probability):
• Noise (e.g. at the power rails)
• VTH disturbances (read/write of neighbor cells)
• Retention (leakage problems)
The allowed probability of failed reads after correction depends on the use
case of the application. Price-sensitive consumer applications, with a relatively
low number of read accesses during the product lifetime, can tolerate a higher
probability of read failures than high-end applications with a high number of
memory accesses. The most demanding applications are cache modules for
processors.
The reliability that a memory can offer is its intrinsic error probability. This
probability may not be the one that the user wishes. Through ECC it is possible
to bridge the gap between the desired error probability and the error
probability offered by the memory (Chap. 14).
The object of the theory of error correction codes is the addition of redundant
terms to the message, such that, on reading, it is possible to detect the errors and to
recover the message that has most probably been written.
Methods of error correction are applied for the purpose of data restoration at
read access. Block code error correction is applied on sub-sectors of data.
Depending on the error correcting scheme used, a different number of redundant
bits, called parity bits, is needed.
Between the length n of the code words, the number k of information bits and
the number t of correctable errors, a relationship known as Hamming inequality
exists, from which it is possible to compute the minimum number of parity bits:

$$\sum_{i=0}^{t} \binom{n}{i} \le 2^{n-k} \qquad (2.3)$$

It is not always possible to reach this minimum number: the number of parity
bits for a good code must be as near as possible to this number. On the other hand,
the bigger the size of the sub-sector is, the lower the relative amount of spare area
(for parity bits) is. Hence, there is an impact on Flash die size.
BCH and Reed–Solomon codes have a very similar structure, but BCH codes
require fewer parity bits and this is one of the reasons why they were preferred for
an ECC embedded in the NAND memory [11].
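
The Hamming inequality, sum over i = 0..t of C(n, i) ≤ 2^(n−k), can be evaluated numerically to find the minimum number of parity bits n − k for given k and t. The sketch below does this by brute-force search; the function name is hypothetical, and the result is only a lower bound, since a real code may need more parity bits.

```python
from math import comb


def min_parity_bits(k, t):
    """Smallest r = n - k that satisfies the Hamming inequality for n = k + r.

    k: number of information bits
    t: number of correctable errors
    """
    r = 0
    # Increase r until the number of error patterns of weight <= t
    # fits within the 2**r distinct parity values.
    while sum(comb(k + r, i) for i in range(t + 1)) > 2 ** r:
        r += 1
    return r
```

For k = 4 and t = 1 the search returns 3, which matches the classic (7, 4) Hamming code. For a 512-byte sub-sector (k = 4,096 bits) and t = 1 it returns 13, and the bound grows with t, so correcting more errors costs more spare area.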
`
`
`
`
`