(19) European Patent Office

(11) EP 1 820 309 B1

(12) EUROPEAN PATENT SPECIFICATION

(51) Int Cl.: H04L 12/56 (2006.01)
(45) Date of publication and mention
of the grant of the patent:
27.08.2008 Bulletin 2008/35

(21) Application number: 05850071.1

(22) Date of filing: 30.11.2005

(86) International application number:
PCT/IB2005/053970

(87) International publication number:
WO 2006/059283 (08.06.2006 Gazette 2006/23)

(54) STREAMING MEMORY CONTROLLER
STREAMING-SPEICHERSTEUERUNG
CONTROLEUR DE MEMOIRE EN CONTINU

(84) Designated Contracting States:
AT BE BG CH CY CZ DE DK EE ES FI FR GB GR
HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI
SK TR
Designated Extension States:
AL BA HR MK YU

(30) Priority: 03.12.2004 EP 04106274

(43) Date of publication of application:
22.08.2007 Bulletin 2007/34

(73) Proprietor: Koninklijke Philips Electronics N.V.
5621 BA Eindhoven (NL)

(72) Inventors:
• BURCHARD, Artur
NL-5656 AA Eindhoven (NL)
• HEKSTRA-NOWACKA, Ewa
NL-5656 AA Eindhoven (NL)
• HARMSZE, Francoise, J.
NL-5656 AA Eindhoven (NL)
• VAN DEN HAMER, Peter
NL-5656 AA Eindhoven (NL)

(74) Representative: van der Veer, Johannis Leendert
et al
NXP Semiconductors B.V.
IP&L Department
High Tech Campus 32
5656 AE Eindhoven (NL)

(56) References cited:
US-A- 5 751 951
US-B1- 6 405 256
US-A1- 2002 034 162

Note: Within nine months of the publication of the mention of the grant of the European patent in the European Patent Bulletin, any person may give notice to the European Patent Office of opposition to that patent, in accordance with the Implementing Regulations. Notice of opposition shall not be deemed to have been filed until the opposition fee has been paid. (Art. 99(1) European Patent Convention).

Printed by Jouve, 75001 PARIS (FR)
`PATENT OWNER DIRECTSTREAM, LLC
`EX. 2172, p. 1
Description

[0001] The present invention relates to a memory controller and a method for coupling a network and a memory.
[0002] The complexity of advanced mobile and portable devices increases. The ever more demanding applications of such devices, and their complexity, flexibility and programmability requirements, intensify the data exchange inside the devices. The devices implementing such applications often consist of several functions or processing blocks, here called subsystems. These subsystems are typically implemented as separate ICs, each having a different internal architecture consisting of local processors, busses, memories, etc. Alternatively, various subsystems may be integrated on one IC. At system level, these subsystems communicate with each other via a top-level interconnect that provides certain services, often with real-time support. Examples of subsystems in a mobile phone architecture include, among others, a base-band processor, a display, a media processor, or a storage element. To support multimedia applications, these subsystems exchange most of the data in a streamed manner. As an example of data streaming, reference is made to the readout of an MP3-encoded audio file from local storage by a media processor and the sending of the decoded stream to the speakers. Fig. 1 shows a basic representation of such a communication, which can be described as a graph of processes P1-P4 connected via FIFO buffers B. Such a representation is often referred to as a Kahn process network. The Kahn process network can be mapped onto the system architecture, as described in E.A. de Kock et al., "YAPI: Application modeling for signal processing systems", in Proc. of the 37th Design Automation Conference, Los Angeles, CA, June 2000, pages 402-405, IEEE, 2000. In such an architecture the processes are mapped onto the subsystems, the FIFO buffers onto memories SMEM, and the communications onto the system-level interconnect IM.
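The graph-of-processes structure described above can be sketched in software. The following is an illustrative Python sketch (not from the patent, all names invented for illustration) of three processes communicating through bounded FIFO buffers, in the spirit of a bounded Kahn process network:

```python
# Illustrative sketch: processes P1-P3 connected via bounded FIFO buffers B.
# put() blocks when a bounded FIFO is full, which is what bounds the network.
import threading
from queue import Queue

b1, b2 = Queue(maxsize=4), Queue(maxsize=4)  # bounded FIFO buffers
result = []

def p1():                          # producer: emits a stream of tokens
    for i in range(8):
        b1.put(i)

def p2():                          # transformer: reads b1, writes b2
    for _ in range(8):
        b2.put(b1.get() * 2)

def p3():                          # consumer: drains b2
    for _ in range(8):
        result.append(b2.get())

threads = [threading.Thread(target=f) for f in (p1, p2, p3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# result now holds the doubled stream [0, 2, 4, ..., 14]
```

Mapping this onto hardware, as the text describes, places the processes on subsystems, the two queues in memories, and the put/get traffic on the system-level interconnect.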
[0003] Buffering is essential for proper support of data streaming between the involved processes. Typically, FIFO buffers are used for streaming, which is in accordance with (bounded) Kahn process network models of streaming applications. With the increased number of multimedia applications that can run simultaneously, the number of processes and real-time streams, as well as the number of associated FIFOs, substantially increases.
[0004] There exist two extreme implementations of streaming with respect to memory usage and FIFO allocation. The first uses physically distributed memory, where FIFO buffers are allocated in the local memory of a subsystem. The second uses physically and logically unified memory, where all FIFO buffers are allocated in a shared, often off-chip, memory. A combination thereof is also possible.
[0005] The FIFO buffers can be implemented in a shared memory using an external DRAM memory technology. SDRAM and DDR-SDRAM are the technologies that deliver large-capacity external memory at low cost, with a very attractive cost to silicon area ratio.
[0006] Fig. 2 shows a basic architecture of a system on chip with a shared memory streaming framework. The processing units C, S communicate with each other via the buffer B. The processing units C, S as well as the buffer are each associated with an interface unit IU for coupling them to an interconnect means IM. In the case of a shared memory data exchange, the memory can also be used for other purposes, for example for code execution or for dynamic memory allocation for the processes of a program running on a main processor.
[0007] Such a communication architecture or network, including the interconnect means, the interface units as well as the processing units C, S and the buffer B, may provide specific transport facilities and a respective infrastructure giving certain data transport guarantees, such as a guaranteed throughput, guaranteed delivery for error-free transport of data, or a synchronization service for synchronizing source and destination elements such that no data is lost due to the underflow or overflow of buffers. This becomes important if real-time streaming processing is to be performed by the system and real-time support is required for all of the components.
[0008] Within many systems-on-chip (SoC) and microprocessor systems, background memory (DRAM) is used for buffering of data. When the data is communicated in a streaming manner, and buffered as a stream in the memory, pre-fetch buffering can be used. This means that the data from the SDRAM is read beforehand and kept in a special (pre-fetch) buffer. When a read request arrives, it can be served from the local pre-fetch buffer, usually implemented in on-chip SRAM, without the latency otherwise introduced by the background memory (DRAM). This is similar to known caching techniques of random data for processors. For streaming, contiguous (or rather predictable) addressing of data is used in a pre-fetch buffer, rather than the random addresses used in a cache. For more details, please refer to J. L. Hennessy and D. A. Patterson, "Computer Architecture - A Quantitative Approach".
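The pre-fetch idea can be illustrated with a minimal sketch (an assumption-laden model, not the patent's implementation): because streaming addresses are predictable, a whole burst is fetched from the background memory ahead of time, and subsequent reads are served from the local buffer.

```python
# Sketch of a stream pre-fetch buffer. "dram" models the background memory
# as a list of words; one _fetch() models one burst access to it.
from collections import deque

class PrefetchBuffer:
    def __init__(self, dram, burst_len=8):
        self.dram = dram           # backing store (modelled, not real DRAM)
        self.burst = burst_len     # words fetched per burst access
        self.buf = deque()         # local (on-chip SRAM-like) buffer
        self.next_addr = 0         # streaming: the next address is predictable

    def _fetch(self):
        # one burst access to the background memory
        chunk = self.dram[self.next_addr:self.next_addr + self.burst]
        self.buf.extend(chunk)
        self.next_addr += len(chunk)

    def read(self):
        # served from the local buffer; only refills on a burst boundary
        if not self.buf:
            self._fetch()
        return self.buf.popleft()

dram = list(range(32))
pf = PrefetchBuffer(dram)
words = [pf.read() for _ in range(10)]   # only 2 burst accesses issued
```

In this sketch ten single-word reads cost only two burst accesses to the backing store, which is the latency-hiding effect the paragraph describes.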
[0009] On the other hand, due to DRAM technology, it is better to access (read or write) DRAM in bursts. Therefore, often a write-back buffer is implemented, which gathers many single data accesses into a burst of accesses of a certain size. Once the initial processing is done for the first DRAM access, every next data word, with an address in a certain relation to the previous one (e.g. next or previous, depending on the burst policy), accessed in every next cycle of the memory, can be stored without any further delay (within 1 cycle), for a specified number of accesses (2/4/8/full page). Therefore, for streaming accesses to memory, where addresses are increased or decreased in the same way for every access (e.g. contiguous addressing), burst access provides the best performance at the lowest power dissipation. For more information regarding the principles of a DRAM memory, please refer to Micron's 128-Mbit DDRRAM specifications,
http://download.micron.com/pdf/datasheets/dram/ddr/128MbDDRx4x8x16.pdf, which is incorporated by reference.
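The burst-gathering behavior of a write-back buffer can be sketched as follows (an illustrative model under assumed names, not the patent's design): single word writes accumulate locally and are issued to the memory only in fixed-size bursts.

```python
# Sketch of a write-back buffer that gathers single writes into bursts,
# since burst accesses give the best performance and power per word.
class WriteBackBuffer:
    def __init__(self, burst_len=8):
        self.burst = burst_len
        self.pending = []     # words gathered but not yet issued
        self.issued = []      # list of bursts handed to the memory

    def write(self, word):
        self.pending.append(word)
        if len(self.pending) == self.burst:
            self.flush()

    def flush(self):
        # one burst access to the memory (possibly a partial final burst)
        if self.pending:
            self.issued.append(self.pending)
            self.pending = []

wb = WriteBackBuffer(burst_len=4)
for w in range(10):
    wb.write(w)
wb.flush()
# wb.issued == [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]: three burst accesses
# instead of ten single accesses
```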
[0010] Until now, controllers of external DRAM were designed to work in bus-based architectures. Buses provide limited services for data transport, simple medium access control, and best-effort data transport only. In such architectures, the unit that gets access to the bus automatically gets access to the shared memory. Moreover, the memory controllers used in such systems are no more than access blocks optimized to perform numerous low-latency reads or writes, often tweaked for random, cache-like processor burst accesses. As a side effect of the low-latency, high-bandwidth, and high-speed optimizations of the controllers, the power dissipation of external DRAM is relatively high.
[0011] The above-mentioned network services are, however, only applicable within the network. As soon as a data exchange occurs with any component outside the network, the network service guarantees are not met. Within a shared memory architecture, data which is to be buffered will typically be exchanged via the physically unified memory, such that data needs to be transported to and from the memory, whereby the data will break the services provided by the network, as neither a memory controller nor the memory itself supports any of the network services.
[0012] US 5,751,951 discloses a memory controller for coupling a memory to a network. The memory controller comprises a first interface for connecting the memory controller to the network. The memory controller furthermore comprises a streaming memory unit having a buffer for temporarily storing at least part of the data streams and a buffer managing unit for managing the temporary storing of data streams in the buffer. The memory controller furthermore comprises an interface for exchanging data with the memory in bursts.
[0013] It is an object of the invention to provide a memory controller for coupling a network and a memory, as well as a method for coupling a network and a memory, which together with the memory improve the predictable behavior of the communication between the network and the memory.
[0014] This object is solved by a memory controller according to claim 1 and by a method for coupling a network and a memory according to claim 3.
[0015] A memory controller is provided for coupling a memory to a network. The memory controller comprises a first interface, a streaming memory unit and a second interface. The first interface is used for connecting the memory controller to the network for receiving and transmitting data streams. The streaming memory unit is coupled to the first interface for controlling data streams between the network and the memory. The streaming memory unit comprises a buffer for temporarily storing at least part of the data streams and a buffer managing unit for managing the temporary storing of the data streams in the buffer. The second interface is coupled to the streaming memory unit for connecting the memory controller to the memory in order to exchange data with the memory in bursts. The streaming memory unit is provided to implement network services of the network onto the memory.
[0016] Accordingly, with such a memory controller, a memory which does not implement the network services provided by a network can be integrated with a communication network supporting specific network services. In other words, the same services will be applicable to data being communicated within the network and to data which is exchanged with the memory subsystem.
[0017] According to an aspect of the invention, the first interface is implemented as a PCI-Express interface such that the properties and network services of a PCI-Express network can be implemented by the memory controller.
[0018] According to a further aspect of the invention, the memory is at least partly organized as FIFOs and a stream identifier is associated with every data stream from the network. The streaming memory unit is provided to control the data streams from/to the network by directing a particular data stream to a particular FIFO in the memory according to the stream identifier of the data stream. Furthermore, an arbitration is performed between the different data streams for accessing the memory. The second interface is arranged to exchange a relatively coarse-grained stream of data with the memory and a relatively fine-grained stream of data with the network. As the stream identifier of a data stream is used to map the data stream onto a FIFO in the memory, a simple addressing scheme is realized.
[0019] According to a further aspect of the invention, the network is implemented as a PCI-Express network and a PCI-Express ID is used in the network for addressing purposes. The first interface is then implemented as a PCI-Express interface. The streaming memory unit converts a PCI-Express ID into a FIFO memory address as well as a FIFO memory address into a PCI-Express ID. Accordingly, the PCI-Express device addressing scheme is used to address the FIFO buffers within the memory.
[0020] The invention also relates to a method for coupling a memory to a network. Data streams are received and transmitted via a first interface (PI) for connecting a memory controller to the network. The data streams between the network and the memory are controlled by a streaming memory unit (SMU). At least part of the data streams is temporarily stored in a buffer. The temporary storing of the data streams in the buffer is managed. The streaming memory controller is coupled to the memory via a second interface, and data is exchanged with the memory in bursts. Network services of the network are implemented onto the memory.
[0021] The invention relates to the idea of introducing a streaming memory controller associated with a shared memory. The streaming memory controller is able to provide the same services as a network. Such services may be flow control, virtual channels and memory bandwidth arbitration tuned to the network bandwidth arbitration. Such services guaranteed by the network will then also be guaranteed by the memory controller if data leaves the network in order to be buffered
in the memory. The integrity of the network services will thus be preserved from the source of the data to its destination.
[0022] Other aspects of the invention are subject to the dependent claims.
[0023] These and other aspects of the invention are apparent from and will be elucidated with reference to the embodiments described hereinafter and with respect to the following figures.

Fig. 1 shows a basic representation of a Kahn process network and its mapping onto a shared memory architecture;
Fig. 2 shows a basic architecture of a system on chip with a shared memory streaming framework;
Fig. 3 shows a block diagram of a system on chip according to the first embodiment;
Fig. 4 shows the logical architecture of a SDRAM for the state when the memory clock is enabled;
Fig. 5 shows a block diagram of a streaming memory controller SMC according to a second embodiment;
Fig. 6 shows a block diagram of a logical view of the streaming memory controller SMC;
Fig. 7 shows a block diagram of an architecture of a system on chip according to a third embodiment;
Fig. 8 shows a format of an ID within a PCI-Express network;
Fig. 9 shows a configuration within a PCI-Express system;
Fig. 10 shows a block diagram of a system on chip according to the fourth embodiment;
Fig. 11 shows an example of the memory allocation within the memory of Fig. 10; and
Fig. 12 shows the power dissipation of external DDR-SDRAM versus the burst size of the access, and the worst-case delay versus the buffer size in network packets.

[0024] Fig. 3 shows a block diagram of a system on chip according to the first embodiment. A consumer C and a producer P are coupled to a PCI-Express network PCIE. The communication between the producer and consumer P, C is performed via the network PCIE and a streaming memory controller SMC to an (external) memory MEM. The (external) memory MEM can be implemented as a DRAM or an SDRAM. As the communication between the producer P and the consumer C is a stream-based communication, FIFO buffers are provided in the external memory MEM for this communication.
[0025] The streaming memory controller SMC according to Fig. 3 has two interfaces: one towards the PCI Express fabric, and a second towards the DRAM memory MEM. The PCI Express interface of the streaming memory controller SMC must perform traffic shaping on the data retrieved from the SDRAM memory MEM to comply with the traffic rules of the PCI Express network PCIE. On the other interface of the streaming memory controller SMC, the access to the DRAM is performed in bursts, since this mode of accessing data stored in DRAM has the biggest advantage with respect to power consumption. The streaming memory controller SMC itself must provide intelligent arbitration of access to the DRAM among the different streams, such that throughput and access latency are guaranteed. Additionally, the SMC also provides functionality for smart FIFO buffer management.
[0026] The basic concept of a PCI-Express network is described in "PCI Express Base Specification, Revision 1.0", PCI-SIG, July 2002, www.pcisig.org.
[0027] The features of a PCI Express network which are taken into consideration in the design of the streaming memory controller are: isochronous data transport support, flow control, and a specific addressing scheme. The isochronous support is primarily based on the segregation of isochronous and non-isochronous traffic by means of Virtual Channels VCs. Consequently, network resources like bandwidth and buffers are explicitly reserved in the switch fabric for specific streams, such that it is guaranteed that streams in different virtual channels VC do not interfere with each other. Additionally, the isochronous traffic in the switch fabric is regulated by scheduling, namely admission control and service discipline.
[0028] The flow control is performed on a credit basis to guarantee that no data is lost in the network PCIE due to buffer under- or overflows. Each network node is only allowed to transmit a network packet through a network link to the other network node when the receiving node has enough space to receive the data. Every virtual channel VC comprises a dedicated flow control infrastructure. Therefore, a synchronization between the source and the destination can be realized, through chained PCI Express flow control, separately for every virtual channel VC.
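Credit-based flow control of this kind can be sketched as follows. This is a deliberately simplified model (real PCI Express credits are accounted per packet type and in flow-control units; the class and method names here are invented): a sender may only transmit while the receiver has advertised free buffer space.

```python
# Simplified sketch of credit-based link flow control: one credit per
# receiver buffer slot; sending consumes a credit, consumption returns one.
class CreditLink:
    def __init__(self, receiver_buffer_slots):
        self.credits = receiver_buffer_slots  # advertised at initialization
        self.in_flight = []                   # packets buffered at receiver

    def can_send(self):
        return self.credits > 0

    def send(self, packet):
        if not self.can_send():
            raise RuntimeError("no credits left: sender must stall")
        self.credits -= 1
        self.in_flight.append(packet)

    def receiver_consumed(self):
        # receiver frees a buffer slot and returns a credit to the sender
        self.in_flight.pop(0)
        self.credits += 1

link = CreditLink(receiver_buffer_slots=2)
link.send("pkt0")
link.send("pkt1")
blocked = not link.can_send()   # receiver buffer full: sender stalls
link.receiver_consumed()        # a credit flows back, sending may resume
```

Because each virtual channel has its own such credit loop, stalling one stream does not affect the others, which is what the arbitration scheme later in the text relies on.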
[0029] The PCI Express addressing scheme typically uses 32 or 64 bit memory addresses. As no explicit memory addresses are to be used, device and function IDs, i.e. stream IDs, are used to differentiate between the different streams. The memory controller SMC itself will generate/convert stream IDs into the actual memory addresses.
[0030] In order to simplify the addressing scheme even further, the ID of the virtual channel VC is used as the stream identifier. Since PCI Express allows up to eight virtual channels VCs, half of them can be used for identifying incoming streams and the other half for identifying outgoing streams from the external memory. Therefore, the maximum number of streams that can access the memory through the memory controller SMC is limited to eight. Please note that this limitation is due to PCI Express, which allows for arbitration between streams in different VCs but not between those inside the same virtual channel VC. However, this limitation is specific to PCI Express based systems; it is not fundamental to the concepts of the present invention.
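The conversion between stream IDs and FIFO memory addresses can be sketched as below. The FIFO layout constants are pure assumptions for illustration; the patent does not specify them.

```python
# Illustrative sketch of the VC-as-stream-ID addressing: the virtual channel
# number selects a FIFO region in the shared memory. Layout values assumed.
FIFO_SIZE = 0x4000            # bytes reserved per stream FIFO (assumption)
FIFO_REGION_BASE = 0x100000   # start of the FIFO region (assumption)

def vc_to_fifo_base(vc):
    # e.g. VC0-VC3 for incoming streams, VC4-VC7 for outgoing streams
    if not 0 <= vc < 8:
        raise ValueError("PCI Express allows at most eight virtual channels")
    return FIFO_REGION_BASE + vc * FIFO_SIZE

def fifo_addr_to_vc(addr):
    # inverse conversion: a FIFO memory address back to a stream/VC ID
    return (addr - FIFO_REGION_BASE) // FIFO_SIZE
```

With such a scheme no explicit memory addresses travel over the network; the controller derives them locally from the stream identifier, which is the "simple addressing scheme" referred to above.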
[0031] Summarizing, the PCI Express interface of the memory controller SMC consists of a full PCI Express interface, additionally equipped with some logic necessary for address translation and stream identification.
[0032] In the first embodiment a (DDR-)SDRAM memory is used. As an example, one can refer to Micron's 128-Mbit DDR-SDRAM as described in Micron's 128-Mbit DDRRAM specifications, http://download.micron.com/pdf/datasheets/dram/ddr/128MbDDRx4x8x16.pdf. Such technology is preferable since it provides desirable power consumption and timing behavior. However, the design is parameterized, and the memory controller SMC can also be configured to work with single data rate memory. Since DDR-SDRAM behaves similarly to SDRAM, except for the timing of the data lines, we explain the basics using SDRAM concepts.
[0033] The PCI Express network PCIE provides network services, e.g. guaranteed real-time data transport, through exclusive resource/bandwidth reservation in the devices that are traversed by the real-time streams. When an external DRAM supported by a standard controller is connected to the PCI Express fabric, without any intelligent memory controller in between, the bandwidth and delay guarantees typically provided by PCI Express will not be fulfilled by the memory, since it does not give any guarantees and acts as a "slave" towards incoming traffic.
[0034] The design of a standard memory controller focuses on delivering the highest possible bandwidth at the lowest possible latency. Such an approach is suited for processor data and instruction (cache) accesses and not for isochronous traffic. To be able to provide the predictable behavior of a PCI Express network extended with an external DRAM, a streaming memory controller is needed which guarantees a predictable behavior of the external memory for streaming. In addition, we aim to design the memory controller not only for guaranteeing throughput and latency, but also for reducing the power consumption while accessing this DRAM.
[0035] Fig. 4 shows the logical architecture of an SDRAM for the state when the memory clock is enabled, i.e. the memory is in one of the power-up modes. The SDRAM comprises a logic unit L, a memory array AR, and data rows DR. When the clock is disabled, the memory is in a low-power state (power-down mode).
[0036] Typical commands applied to a memory are activate ACT, pre-charge PRE, read/write RD/WR, and refresh. The activate command ensures that, after charging, a bank and row address are selected and the data row (often referred to as a page) is transferred to the sense amplifiers. The data remains in the sense amplifiers until the pre-charge command restores the data to the appropriate cells in the array. When data is available in the sense amplifiers SAM, the memory is said to be in the active state. During this state, reads and writes can take place. After the pre-charge command, the memory is said to be in the pre-charge state, where all data is stored in the cell array. Another interesting aspect of memory operation is the refresh. The memory cells of the SDRAM store data using small capacitors, and these must be recharged regularly to guarantee the integrity of the data. When powered up, the SDRAM memory is instructed by the controller to perform refreshes. When powered down, the SDRAM is in self-refresh mode (i.e. no clock is enabled) and the memory performs refreshes on its own. This state consumes very little power. Getting the memory out of the self-refresh mode into a state in which data can be asserted for read or write takes more time than for the other modes (e.g. 200 clock cycles, specifically for DDR-SDRAM).
[0037] The timing and power management of the memory is important for the proper design of the memory controller SMC, which must provide specific bandwidth, latency and power guarantees. Reading a full page (equal to 1 Kbyte) from an activated SDRAM may take about 2560 clock cycles (~19.2 μs) for a burst length of 1 read, 768 clock cycles (~5.8 μs) for a burst length of 8 reads, and only 516 clock cycles (~3.9 μs) for a full-page burst. These values are based on the specific 128-Mbit DDR-SDRAM with a clock period of 7.5 ns as described in Micron's 128-Mbit DDRRAM specifications, http://download.micron.com/pdf/datasheets/dram/ddr/128MbDDRx4x8x16.pdf.
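The quoted access times follow directly from the cycle counts and the 7.5 ns clock period, as a quick check confirms:

```python
# Re-deriving the page-access times quoted above for the 128-Mbit
# DDR-SDRAM (7.5 ns clock period, 1 Kbyte page).
CLOCK_NS = 7.5
cycles = {"burst length 1": 2560, "burst length 8": 768, "full page": 516}
times_us = {k: v * CLOCK_NS / 1000 for k, v in cycles.items()}
# yields about 19.2 us, 5.76 us and 3.87 us, matching the ~19.2/5.8/3.9 us
# figures in the text
```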
[0038] Fig. 5 shows a block diagram of a streaming memory controller SMC according to a second embodiment. The streaming memory controller SMC comprises a PCI-Express interface PI, a streaming memory unit SMU and a further interface MI which serves as the interface to an (external) SDRAM memory. The streaming memory unit SMU comprises a buffer manager unit BMU, a buffer B, which may be implemented as a SRAM memory, as well as an arbiter ARB. The streaming memory unit SMU, which implements the buffering in SRAM, is used together with the buffer manager for buffering accesses to the SDRAM made via the PCI-Express interface. The buffer manager unit BMU serves to react to read or write accesses to the SDRAM from the PCI-Express interface, to manage the buffers (updating the pointer registers) and to relay data from/to the buffers (SRAM) and from/to the SDRAM. In particular, the buffer manager unit BMU may comprise a FIFO manager and a stream access unit SAU.
[0039] The stream access unit SAU provides a stream ID, an access type, and the actual data for each stream. For each packet received from the PCI Express interface, the stream access unit SAU forwards the data, based on its virtual channel number VC0-VC7, to an appropriate input buffer implemented in the local shared SRAM memory. For data retrieved from the (DDR-)SDRAM's FIFOs and placed in the output buffer B in the local SRAM, it generates the destination address and passes the data to the PCI Express interface PI. The arbiter ARB decides which stream may access the (DDR-)SDRAM. The SRAM memory implements the input/output buffering, i.e. for pre-fetching and write-back purposes. The FIFO manager, which is at the heart of the SMC, implements the FIFO functionality for the memory through address generation for the streams, access pointer updates, and additional controls.
[0040] Fig. 6 shows a block diagram of a logical view of the streaming memory controller SMC. Each of the streams ST1-ST4 is associated with a separate buffer. As only one stream at a time can access the external DRAM, an arbiter ARB is provided which performs the arbitration in combination with a multiplexer MUX.
[0041] The arbitration of the memory access between different real-time streams is essential for guaranteeing throughput and a bounded access delay. Assume that whenever data is written to or read from the memory, a full page is either written or read, i.e. the access is performed in bursts. The time needed to access one page (slightly different for read and write operations) can be referred to as a time slot. A service cycle is defined as consisting of a fixed number of time slots. The access sequence repeats and resets as every new service cycle is started.
[0042] The arbitration algorithm between streams according to the second embodiment is credit based. Each stream gets a number of credits (time slots) reserved, the same for every service cycle. The number of credits reflects the bandwidth requirements of the stream. Each time an access is granted to a stream, the number of credits available for the granted stream decreases. The credit count per stream is updated every time the arbitration occurs. Furthermore, the credits are reset at the end of the service cycle to guarantee the periodicity of the arbitration process. The credit counts can also be merely refreshed (e.g. all decreased by the lowest value of all counts) to provide the arbitration with a memory of previous service cycles, in case adaptive arbitration over a longer time is needed. In the extreme case, a single, infinitely long service cycle can be used.
[0043] When multiple streams want to access the memory in the same time slot, the credit count is used as the arbitration criterion. The stream that has used the least of its credits (relatively, measured as the ratio between used and reserved credits per current service cycle) gets the access. The denied request is buffered and scheduled (or arbitrated with another incoming request) for the next time slot. In case the credit ratios are the same for two requesting streams, the one that requires the lower access latency gets the access first (e.g. read over write).
[0044] In this way, every stream (if requesting) gets in the worst case the reserved number of accesses to the memory per service cycle, regardless of the order of the incoming requests or the behavior of the other streams. This guarantees that the bandwidth requirement for every stream is met.
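The credit-based arbitration described in paragraphs [0042] to [0044] can be sketched as a small model (an illustrative Python sketch, not the patent's implementation; ties here are broken by stream order rather than by access latency):

```python
class CreditArbiter:
    """Credit-based time-slot arbitration: each stream reserves a number of
    slots per service cycle; among the requesting streams, the one with the
    lowest used/reserved credit ratio is granted the next slot."""

    def __init__(self, reserved):
        self.reserved = dict(reserved)          # stream -> reserved slots
        self.used = {s: 0 for s in reserved}    # credits used this cycle

    def grant(self, requesting):
        # lowest relative credit usage wins; min() breaks ties by list order
        winner = min(requesting, key=lambda s: self.used[s] / self.reserved[s])
        self.used[winner] += 1
        return winner

    def new_service_cycle(self):
        # reset the credits to guarantee periodicity of the arbitration
        self.used = {s: 0 for s in self.used}

# three streams reserving 10, 20 and 30 of 60 slots, all requesting always
arb = CreditArbiter({"S1": 10, "S2": 20, "S3": 30})
schedule = [arb.grant(["S1", "S2", "S3"]) for _ in range(11)]
# schedule reproduces the Sdl row of Table 1:
# ['S1','S2','S3','S3','S2','S3','S1','S2','S3','S3','S2']
```

Running the model for a full 60-slot service cycle grants exactly 10, 20 and 30 slots to the three streams, which is the worst-case guarantee stated in [0044].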
[0045] Now an example of the credit-based arbitration algorithm is described in more detail. A time slot is defined as equal to a page (1 KB) access to the SDRAM memory MEM which, as calculated before, is equal to 3.9 μs. Moreover, it is assumed that the service cycle has 60 time slots, so it is equal to 234 μs. Therefore, there will be 4273 service cycles per second, which results in a total memory bandwidth of about 2 Gbit/s (4273 * 60 * 1 KB). It is assumed that 3 streams having respectively 350 Mbit/s, 700 Mbit/s and 1050 Mbit/s bandwidth requirements are provided. Therefore, the reserved credit count per service cycle of the first stream ST1 will be 350/2100 times 60 slots, which equals 10 slots. Streams 2 and 3 ST2, ST3 will have 20 and 30 reserved credits, respectively. Table 1 shows the stream schedule (row Sdl) that results from the arbitration. It also shows the credit (bandwidth) utilization levels that determine the arbitration result (rows CS1, CS2, CS3, measured as the ratio between used and reserved credits per current service cycle) per time slot (row Slot).

Table 1. Example of the Credit Based Arbitration

Slot:  1     2     3     4     5     6     7     8     9     10    11
CS1:   0.1   0.1   0.1   0.1   0.1   0.1   0.2   0.2   0.2   0.2   0.2
CS2:   0     0.05  0.05  0.05  0.1   0.1   0.1   0.15  0.15  0.15  0.2
CS3:   0     0     0.03  0.06  0.06  0.1   0.1   0.1   0.13  0.16  0.16
Sdl:   S1    S2    S3    S3    S2    S3    S1    S2    S3    S3    S2

[0046] While the reserved bandwidth is always guaranteed for each stream, the reserved but unused slots can be reused by other streams if necessary. This also enables a flexible allocation of the bandwidth. While keeping all guarantees, it enables a flexible handling of the unavoidable fluctuations in the network.
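The figures of the example can be re-derived directly from the values given in the text:

```python
# Re-deriving the numbers of the arbitration example: a 3.9 us time slot,
# 60 slots per service cycle, and streams of 350/700/1050 Mbit/s.
slot_us = 3.9
slots = 60
cycle_us = slot_us * slots                  # 234 us per service cycle
cycles_per_s = 1e6 / cycle_us               # about 4273 service cycles/s
total_gbit_s = cycles_per_s * slots * 1024 * 8 / 1e9   # about 2.1 Gbit/s
reserved = {s: round(bw / 2100 * slots)     # reserved credits per cycle
            for s, bw in {"ST1": 350, "ST2": 700, "ST3": 1050}.items()}
# reserved == {'ST1': 10, 'ST2': 20, 'ST3': 30}
```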
[0047] Furthermore, sufficient buffering of the incoming requests must be provided to ensure that the above scheme works. A mechanism for stalling the requesting streams in case other streams are granted access is also required. The stalling mechanism may be implemented using PCI Express flow control, which enables the delaying of any stream, separately per virtual channel VC. The minimal buffering required can therefore be equal to the size of the data accessed from memory during one time slot, i.e. one page. Increasing the access buffering beyond this is therefore not needed. However, doing so will decrease the access latency, as such buffers then behave as pre-fetch or write-back buffers.
[0048] The mentioned over-dimensioning of the I/O buffers relaxes the arbitration. The proposed arbitration algorithm is fully parameterized. Most aspects of the arbitration can be programmed. For example, the particular arbitration strategy can be chosen at configuration time, the granularity of memory access (a time slot) can be changed from a page to a burst of another length, and finally the number of tim
