Wilford et al.

(10) Patent No.: US 6,687,247 B1
(45) Date of Patent: *Feb. 3, 2004
`
(54) ARCHITECTURE FOR HIGH SPEED CLASS OF SERVICE ENABLED LINECARD

(75) Inventors: Bruce Wilford, Los Altos, CA (US); Yie-Fong Dan, Cupertino, CA (US)
`
(73) Assignee: Cisco Technology, Inc., San Jose, CA (US)
`
(*) Notice: This patent issued on a continued prosecution application filed under 37 CFR 1.53(d), and is subject to the twenty year patent term provisions of 35 U.S.C. 154(a)(2).

Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days.
`
(21) Appl. No.: 09/428,870

(22) Filed: Oct. 27, 1999
`
(51) Int. Cl.7 .............. H04L 12/28
(52) U.S. Cl. ............... 370/392; 370/412
(58) Field of Search ......... 370/392, 389, 393, 475, 397, 400, 471, 399, 349, 401, 402, 413, 412; 709/238, 265
`
(56) References Cited

U.S. PATENT DOCUMENTS
`
5,509,006 A       4/1996  Wilford et al. ........ 370/60.1
5,574,910 A  *   11/1996  Bialkowski et al. ..... 707/1
5,781,532 A  *    7/1998  Watt .................. 370/236
5,802,287 A  *    9/1998  Rostoker et al. ....... 370/395.5
5,852,655 A      12/1998  McHale et al. ......... 379/93.14
5,872,783 A       2/1999  Chin .................. 370/392
6,157,641 A  *   12/2000  Wilford ............... 370/389
6,259,699 B1 *    7/2001  Opalka et al. ......... 370/389
6,449,271 B1 *    9/2002  Lenell et al. ......... 370/383
6,463,067 B1 *   10/2002  Hebb et al. ........... 370/413
`
* cited by examiner

Primary Examiner—Chi Pham
Assistant Examiner—Alexander O. Boakye
(74) Attorney, Agent, or Firm—Campbell Stephenson Ascolese LLP
`
(57) ABSTRACT
`
A linecard architecture for high speed routing of data in a communications device. This architecture provides low latency routing based on packet priority: packet routing and processing occurs at line rate (wire speed) for most operations. A packet data stream is input to the inbound receiver, which uses a small packet FIFO to rapidly accumulate packet bytes. Once the header portion of the packet is received, the header alone is used to perform a high speed routing lookup and packet header modification. The queue manager then uses the class of service information in the packet header to enqueue the packet according to the required priority. Enqueued packets are buffered in a large memory space holding multiple packets prior to transmission across the device's switch fabric to the outbound linecard. On arrival at the outbound linecard, the packet is enqueued in the outbound transmitter portion of the linecard architecture. Another large, multi-packet memory structure, as employed in the inbound queue manager, provides buffering prior to transmission onto the network.
`
41 Claims, 31 Drawing Sheets

[Representative drawing — FIG. 2: linecard control element 130, showing network physical interface 210, inbound receiver 220, lookup circuit 225, lookup memory 227, rate limiter 230, inbound queue manager 240, inbound packet buffer 245, fabric interface 170, switch fabric 120, outbound queue manager 280, rate limiter 270, outbound receiver, and outbound packet buffer 285.]
Cloudflare - Exhibit 1012, page 1

[Sheet 1 of 31 — FIG. 1: Routing system 100. Line card 110 comprises input interface 111 and output interface 112 to the communications network, inbound packet receiver 140, lookup circuit 145, control element 130, inbound and outbound memory controllers 150, memories 160, fabric interface 170, outbound transmitter 180, and control circuits 190; switching (crossbar) fabric 120 carries inbound packets 113 and outbound packets 114.]
`
`
`
[Sheet 2 of 31 — FIG. 2: Control element 130. Network physical interface 210 feeds inbound receiver 220, which works with lookup circuit 225 and lookup memory 227; inbound packets 113 pass through rate limiter 230 to inbound queue manager 240 and inbound packet buffer 245, then through fabric interface 170 to switch fabric 120. Outbound packets 114 return through fabric interface 170 to the outbound receiver, rate limiter 270, outbound queue manager 280, and outbound packet buffer 285.]
`
`
`
[Sheet 3 of 31 — FIG. 3: Inbound receiver detail. Network physical interface 210 feeds input FIFO 315 under FIFO controller 310; headers go to lookup controller 320 and lookup circuit 225 (with lookup memory 227) and to rate check 330 and CAR 340; CAR results, lookup results, and control info drive packet modifier 350, which passes complete packets to inbound queue manager 240.]
`
`
`
[Sheet 4 of 31 — FIG. 4: Inbound queue manager 240 detail. Rate limiter 230 feeds queue manager 410, which maintains free block queue 415 and pointers into inbound packet buffer 245; dequeue circuit 420 reads packets out to fabric interface 170; a CPU port is also shown.]
`
`
`
[Sheet 5 of 31 — FIG. 5: Outbound receiver 260 detail. Fabric interface 170 feeds multicast duplication module 510, which splits traffic into multicast FIFO 512 and unicast FIFO 514; packets then pass through packet modifier 520, rate check 530, and CAR 540 to outbound queue manager 280.]
`
`
`
[Sheet 6 of 31 — FIG. 6: Outbound queue manager 280 detail. Rate limiter 270 feeds queue manager 610, which maintains free block queue 615 and pointers into outbound packet buffer 285; dequeue circuit 620, governed by rate pacing 650 and a CPU 440 port, reads packets out to network physical interface 210.]
`
`
`
[Sheet 7 of 31 — FIG. 7: L3 switch engine implementation. The PLIM ASIC (aka PHAD) with its xcvr/framers and serdes feeds the RX ASIC, which uses RX memory and Rx packet memory and attaches to the LU (lookup) ASIC; the To-Fab FIA192 ASIC connects to the fabric interface. On the return path, the From-Fab FIA192 ASIC and From_Fab packet reassembly memory feed the TX ASIC, with TX memory and Tx packet memory, back to the PLIM interface.]
`
`
`
[Sheet 8 of 31 — FIG. 8: RX ASIC detail. The PLIM interface feeds a packet buffer FIFO and an LU header FIFO (header to parse, lookup.CORE); CAR control fills the CAR result FIFO; the LU result FIFO carries control info, output port, precedence, and CAR drop; packet modification (tag push/pop, CoS rewrite, IP updates) writes to the memory controller. CPU interface control and the RXLUXOFF/LURXXOFF flow-control signals are also shown.]
`
`
`
[Sheet 9 of 31 — FIG. 9: RX ASIC packet flow and lookup memories, with bus widths and clock rates (e.g., 74 pins @ 200 MHz, 54 pins @ 100 MHz, 36b and 23b @ 100 MHz). The LU CORE attaches to a forwarding table SDRAM (128 MB, 8× TSOP54, 8 @ 8M×16), an adjacency table SRAM (1× TSOP100, 1 @ 256K×36), and forwarding statistics SRAM (3× TSOP100, 3 @ 512K×18).]
`
`
`
[Sheet 10 of 31 — FIG. 10: Lookup (LU) core detail. Inputs rxludata[127:0], rxlucmd[2:0], and lu_hdr_xoff arrive from car_core; an L3 extractor feeds bypass packet FIFO, switched packet FIFO, and packet info storage; a table controller with external SDRAM, IP cache, IPC bypass FIFO, and reorder buffer drives results and feedback FIFOs; LU stats live in external SRAM; the LU Salsa interface (addr/ctrl/data), pktmod_lu_xoff to pkt_mod, and a CPU register interface round out the block.]
`
`
`
[Sheet 11 of 31 — FIG. 11: Lookup tree node fields. A node holds offset and length fields plus a base address; successive groups of IP address bits (W, X, Y, Z, then A through H) index from the base address to the next node or result.]
`
`
`
[Sheet 12 of 31 — FIG. 12: MCC (memory controller) ASIC. Receive side: rx_data[63:0], rx_cmd[2:0], clocks, and parity feed a cell write FIFO and input cell FIFO into a DRAM controller driving two packet memory SDRAMs (128 @ 200 MHz); read scheduling and output processing work with RED and an input MDRR module; the queue manager uses a queue update FIFO and queue state memory, plus free block SRAM (16 @ 100 MHz) via the free block memory interface. SALSA ASIC, DIAG, and CPU interfaces are shown, along with transmit-side signals tf_data[127:0], tf_cmd[2:0], tf_num[5:0], FIFO status, and a queue backpressure module (qbackpressure[15:0]).]
`
`
`
[Sheet 13 of 31 — FIG. 13: MCC memory subsystem detail. The packet buffer sits on 2×72 DIMMs (128 @ 200 MHz) behind an SDRAM interface with a cell write FIFO; output control and channel queues send tx_data to the FIA192 ASIC (128 @ 200 MHz), while the receive path from the fabric ASIC (64 @ 200 MHz) honors rx_fifo_full backpressure. The queue manager, with internal queue memory, handles head requests/grants, head pointers, queue length copies and updates, cell addresses, tx_len, queue numbers, and channels-full status; a CPU interface, CPU cell FIFO, and free block list (16-bit pointers) complete the block.]
`
`
`
[Sheet 14 of 31 — FIG. 14: Packet memory access phases. Incoming packets pass through the Rx module and input processing logic into packet memory during the input phase; the read scheduler/output processing and MDRR module drain it during the output phase; the queue manager coordinates both sides. Arrows distinguish data flow from control flow.]
`
`
`
[Sheet 15 of 31 — FIG. 15: Alternately serviced input and output phases across memory modules/banks. Writes cycle through banks 0/0 through 1/3 in the input phase and reads cycle through the same banks in the output phase; each 200 ns period comprises 4 bursts of 4 writes (80 ns), 4 bursts of 4 reads (80 ns), and 40 ns of turnaround.]
`
`
`
[Sheet 16 of 31 — FIGS. 16A and 16B: Cell layout. FIG. 16A: cells are stored in an interleaved order across memory modules/banks (0/0–0/3, 1/0–1/3) to improve utilization. FIG. 16B: memory layout of a block of cells for a Tx queue; packet cells (P1.1, P1.2, P1.3, P2.1, ...) fill banks in interleaved order; a 64-byte cell is one DRAM burst of 4 to one bank of one memory module; a chunk is 8 consecutive cells across all banks in both memories; the last cell in a block holds a next-block pointer for the queue, and packet data may span blocks.]
`
`
`
[Sheet 17 of 31 — FIG. 17: Read path. The read scheduler generates DRAM bank read addresses and services slots round-robin; MDRR selects output queues to read using FIA FIFO status, queue read requests, channels-busy status, and bytes read from each queue; queue length changes and queue head addresses come from the queue manager. Cells read from DRAM carry cell headers to a post processor that finds packet boundaries and lengths before output on the source-clocked FIA/PLIM interface (256-bit packet data).]
`
`
`
[Sheet 18 of 31 — FIG. 18: MDRR queue selection. Slots 0–15 each run MDRR over per-port COS0–COS7 queues (ports 0–15); slot 16 serves multicast and slot 17 the CPU, each with their own COS0–COS7 MDRR. Within a slot, fair mode round-robins among ports; otherwise all ports with high priority data are served round-robin until none are ready, then ports with low priority data. Across slots, fair mode round-robins among slots, multicast, and CPU; otherwise slots with high priority data are served round-robin until none are ready and the read scheduler is blocked, then slots with low priority data. Read requests are sent only to channels ready for data, using queue length copies and queue head addresses.]
`
`
`
[Sheet 19 of 31 — FIG. 19: Channel queues (8 deep), high and low priority per channel 0–18. The scheduler round-robins among non-empty slots, dequeuing up to 512 bytes for a slot's "current" read request; in fair mode it selects current read requests alternately from the high and low priority channel queues, while in low delay mode it selects low priority requests only when no high priority request is pending in any channel. Eight DRAM bank read addresses go to the DRAM controller every 200 ns and eight cells return every 200 ns; post-read output processing counts bytes and cells per read request and outputs data to the FIA, PLIM, or CPU. Queue head, length, and MDRR deficit are updated when a read request completes, and channel-queues-full status flows back to MDRR.]
`
`
`
[Sheet 20 of 31 — FIG. 20: Read scheduling detail. Channel queues hold 8 requests each for high and low priority channels 0–15, multicast, and CPU, with valid/small status bits (2×8×36) and FIA or PLIM channel backpressure (1×36). The read scheduling state machine selects the next request to process round-robin based on priority, FIA backpressure, and status bits, and reads additional small requests for possible piggybacking with a read. Bank usage logic checks whether the banks a request uses conflict with banks already in use so multiple requests can be grouped; bank address calculation passes Adr[23:8] through and passes or increments Adr[7:3] for selected banks; end-of-block address detection uses Adr[7:0] and Cells_left. Read descriptors (Adr, Cells_left, Cells_to_read, queue/request ID) track request state, updated as Adr+Cells_to_read and Cells_left−Cells_to_read; post-read output processing counts bytes, signals request completion, and drives queue head, queue length, and MDRR updates.]
`
`
`
[Sheet 21 of 31 — FIG. 21: Channel queues (8 deep, high/low priority per channel 0–18) and their scheduling policy: round-robin among non-empty slots, dequeuing up to 512 bytes for a slot's "current" read request; fair mode alternates between high and low priority channel queues, while low delay mode selects low priority requests only when no high priority request is pending in any channel. Eight DRAM bank read addresses are sent to the DRAM controller every 200 ns.]
`
`
`
[Sheet 22 of 31 — FIG. 22: Read address setup path. The read scheduling state machine (round-robin on priority, FIA backpressure, and status bits, reading 8 additional small requests for possible piggybacking) pulls from the channel request queues (8 requests per high/low priority channel 0–15, multicast, and CPU) through read descriptor FIFOs. Bank usage logic, end-of-block address detection (Adr[7:0], Cells_left), and bank address calculation (Adr[23:8] pass-through; Adr[7:3] pass or increment for selected banks) produce DRAM read address setup for the DRAM controller; request state (Adr, Cells_left, ID) is updated as Adr+Cells_to_read and Cells_left−Cells_to_read.]
`
`
`
[Sheet 23 of 31 — FIG. 23: Read scheduler state machine. States: select the next slot/channel by priority/round-robin and wait for the start of address setup time; set up the current request being processed for the slot; once the first request is set up and address setup time arrives, select an additional small request for grouping; then clock valid addresses into the DRAM controller, invalidate banks in use, and advance to the next cycle. Transitions are gated on address setup time and request readiness.]
`
`
`
[Sheet 24 of 31 — FIG. 24: Example read cycle timing table listing, per read cycle, the channel number, queue number, data cells (header vs. data), and DRAM bank (0–7) as requests from different queues are interleaved across banks.]
`
`
`
[Sheet 25 of 31 — FIG. 25: Tx ASIC and its memories: packet buffer in 100 MHz DDR SDRAM (4× DIMM), external MAC 100 MHz SRAM (2× 32-bit), and multicast group ID 100 MHz SRAM (1× 32-bit), with queue status, processor, FIA192, and PLIM interfaces.]
`
`
`
[Sheet 26 of 31 — FIG. 26: Tx ASIC internals. The FIA192 interface feeds an input synchronizer, multicast duplication (with multicast group ID SRAM, 1 Mbyte, 1× 32-bit), packet insertion, and MAC rewrite (with external MAC SRAM, 2 Mbyte, 2× 32-bit); queue status manager, RED, packet accounting, and an output rate pacer sit alongside the MCC memory sub-system and its packet buffer (100 MHz DDR SDRAM, 2× DIMM), with processor and PLIM interfaces. Arrows distinguish control, statistics, and data.]
`
`
`
[Sheet 27 of 31 — FIG. 27: Multicast duplication detail. The FIA interface and input synchronizer feed a unicast FIFO (128 bytes ×16) and a multicast group ID FIFO (32 entries); the MGID fetch engine reads the multicast group ID SRAM (1 Mbyte, 1× 32-bit); the multicast duplication engine uses a multicast FIFO (10k bytes) and feeds the output rate controller toward the MAC rewrite interface, with the packet accounting module tracking traffic.]
`
`
`
[Sheet 28 of 31 — FIG. 28: MAC rewrite detail. The data pipe from multicast duplication feeds a packet decoder; internal MAC DMA (256-byte SRAM) and external MAC DMA (external MAC SRAM, 2 Mbyte, 2× 32-bit) supply MAC headers to MAC assembly and packet assembly, which feed the RED module; the packet accounting module tracks results.]
`
`
`
[Sheet 29 of 31 — FIG. 29: RED module. The data pipe from multicast duplication feeds the RED drop calculator, which uses instantaneous and average queue depth from the MCC memory subsystem, maintains RED statistics, and reports to the packet accounting module. FIG. 30: CAR module, between the RED module interface and the MCC memory subsystem interface: a data path pipeline with CAR rule lookup, token bucket module, CAR statistics module, and packet accounting module.]
`
`
`
[Sheet 30 of 31 — FIG. 31: CAR rule lookup. The output port number (2k-bit SRAM), TAG/IP TOS (4 bits into SRAM), source AS label (7 bits into 16k-bit SRAM), and destination AS label (7 bits into 16k-bit SRAM) each yield 128-bit match vectors; a priority encoder reduces the combined match to a rule number and token bucket number.]
`
`
`
[Sheet 31 of 31 — FIG. 32: CPU interface. Prefetch buffers (0–2) on the SALSA and MCC sides, a CPU interface FIFO (12 kbytes), and the MCC packet memory connect the CPU to the datapath.]
`
`
`
ARCHITECTURE FOR HIGH SPEED CLASS OF SERVICE ENABLED LINECARD
`
BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to communications devices, specifically linecards for interfacing communications devices to networks.

2. Description of the Related Art

In a communications network, routing devices receive messages at one of a set of input interfaces and forward them on to one of a set of output interfaces. Users typically require that such routing devices operate as quickly as possible in order to keep up with the high rate of incoming messages.
In a packet routing network, wherein information is transmitted in discrete chunks or "packets" of data, each packet includes a header. The header contains information used for routing the packet to an output interface and subsequent forwarding to a destination device. The packet may also be forwarded to another router for further processing and/or forwarding. Header information used for routing may include the destination address and source address for the packet. Additionally, header information such as the destination device port, source device port, protocol, and packet priority may be used. Header information used by routing devices for administrative tasks may include information about access control, accounting, quality of service (QoS), or class of service (CoS).
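The header fields listed above can be collected into a simple record. The sketch below is a behavioral model only; the field names and types are illustrative assumptions, not definitions taken from this patent.

```python
from dataclasses import dataclass

@dataclass
class PacketHeader:
    # Fields commonly consulted for forwarding decisions (illustrative)
    dst_addr: str    # destination address
    src_addr: str    # source address
    dst_port: int    # destination device port
    src_port: int    # source device port
    protocol: int    # protocol identifier
    precedence: int  # packet priority / class of service (CoS)

# Example header for one inbound packet
hdr = PacketHeader("10.0.0.2", "10.0.0.1", 80, 49152, 6, 5)
```

A routing device reads only these fields to make its forwarding and CoS decisions; the packet payload is never consulted.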
FIG. 1 is a generic routing system 100 that will be used to describe both the prior art and the invention. A well-known routing device or system 100 consists of a set of linecards 110 and a switching fabric 120. Each linecard 110 includes an input interface 111, an output interface 112, a fabric interface 170, and a control element 130. Linecards 110 connect to communications network 1, which may be any form of local, enterprise, metropolitan, or wide area network known in the art, through both input interface 111 and output interface 112.
Control element 130 is configured to receive inbound packets 113 (i.e., packets entering the system from network 1) from input interface 111, process the packet, and transmit it through fabric interface 170 to switching fabric 120 for further processing by the same or another control element 130. Outbound packets 114 are received from switching fabric 120 through fabric interface 170, processed in control element 130, and transmitted to network 1 on output interface 112.
Control element 130 consists of an inbound packet receiver 140, lookup circuit 145, inbound memory controller 150, first memory 160, fabric interface 170, outbound memory controller 150, second memory 160, and outbound transmitter 180. Control circuits 190 are also provided to perform statistics collection and accounting functions as well as to process certain exception packets.
In a manner well-known in the art, packets are received from the physical medium of the network at input interface 111. The inbound packet receiver 140 operates in conjunction with lookup circuit 145 to determine routing treatments for inbound packets 113. Lookup circuit 145 includes routing treatment information disposed in a memory data structure. Access and use of this information in response to data in the header portion of inbound packet 113 is accomplished with means well-known in the router art. These routing treatments can include one or more of the following:
`
a) selection of one or more output interfaces to which to forward inbound packets 113 responsive to the destination device, to the source and destination device, or to any other packet header information;

b) determination of class of service (CoS) treatment for inbound packets 113;

c) determination of one or more accounting records or treatments for inbound packets 113; and

d) determination of other administrative treatment for inbound packets 113.
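Treatments (a) through (d) amount to a single lookup keyed on header fields. The sketch below models that step behaviorally; the flat dictionary table, function name, and field names are assumptions standing in for the tree-memory lookup structures of the incorporated references.

```python
def lookup_treatment(header, routing_table):
    """Return routing treatments (a)-(d) for one inbound packet.

    routing_table maps destination addresses to treatment records; a
    miss models an exception packet punted to the control circuits.
    """
    entry = routing_table.get(header["dst_addr"])
    if entry is None:
        return {"action": "punt"}          # exception path
    return {
        "output_interfaces": entry["out_ifs"],    # treatment (a)
        "cos": header.get("precedence", 0),       # treatment (b)
        "accounting": entry.get("acct"),          # treatment (c)
        "admin": entry.get("admin"),              # treatment (d)
    }
```

A real lookup circuit performs this classification in hardware at line rate; the dictionary here only captures its input/output behavior.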
Examples of such systems may be found in U.S. Pat. No. 5,088,032, METHOD AND APPARATUS FOR ROUTING COMMUNICATIONS AMONG COMPUTER NETWORKS to Leonard Bosack; U.S. Pat. No. 5,509,006, APPARATUS AND METHOD FOR SWITCHING PACKETS USING TREE MEMORY to Bruce Wilford et al.; U.S. Pat. No. 5,852,655, COMMUNICATION SERVER APPARATUS HAVING DISTRIBUTED SWITCHING AND METHOD to John McHale et al.; and U.S. Pat. No. 5,872,783, ARRANGEMENT FOR RENDERING FORWARDING DECISIONS FOR PACKETS TRANSFERRED AMONG NETWORK SWITCHES to Hon Wah Chin, incorporated in their entireties herein by reference.
One shortcoming known in the prior art arises from the ever-increasing need for speed in network communications. Attempts to scale prior art routers and switches to gigabit speed have shown that architectures requiring deep packet buffering prior to determining routing treatment suffer from high packet latency. Distributed routing schemes, such as that described above wherein routing is performed immediately on packet receipt in each linecard, have had only limited success in providing the necessary increase in throughput speed.
A further drawback of prior art systems is their relative inability to rapidly provide a range of services based on packet priority, as represented by various fields in the packet header. Such systems are often described as providing type of service (TOS), quality of service (QoS), or class of service (CoS) routing. Prior art systems typically experience additional packet latency and throughput reduction when performing routing based on packet priority.
What is needed is a router/switch system, preferably distributed on a linecard, that provides low latency packet routing based at least in part on packet priority. In particular, low latency priority routing determined by individual packet class of service is desired. Such a linecard should operate as close to line rate as possible, i.e., at or near the maximum speed of transmission over the physical medium and without any appreciable buffering delay.
`
SUMMARY

The present invention is a linecard architecture for high speed routing of data in a communications device. This architecture provides low latency routing based on packet priority because packet routing and processing occurs at line rate (i.e., at wire speed) for most operations. Comprised of an inbound receiver (including lookup and packet modification functions), queue manager, and outbound transmitter portions with associated network physical interfaces and a common device switching fabric, the architecture provides a distributed routing function with minimal packet delay.
Packets arrive from the network via a physical medium interface, in one embodiment an OC192 fiber optic connection. Demodulation, deframing, and conditioning are performed by means well-known in the art to supply an OSI layer 3 packet data stream to the inbound receiver. The inbound receiver uses a small, single packet FIFO to accumulate packet bytes very rapidly, at line rate. Once the header portion of the packet, in one embodiment defined as the first 60 bytes, is received, it is used to rapidly perform a routing lookup. The lookup data returned is then used to modify the packet header, and rate limiting and buffer management rules are applied to the packet. All of the above steps occur essentially at line rate, without the buffering-induced delay seen in the prior art.
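The header-first timing described above — the lookup firing as soon as the header is in the FIFO, while the rest of the packet is still arriving — can be sketched behaviorally. The function and callback names are assumptions for illustration; the embodiment's actual receiver is hardware operating at line rate.

```python
HEADER_LEN = 60  # header size used in the embodiment described above

def receive_packet(byte_stream, do_lookup):
    """Sketch of the inbound receiver's header-first flow.

    The routing lookup starts as soon as HEADER_LEN bytes are in the
    FIFO, overlapping with the arrival of the packet body; do_lookup
    stands in for the lookup circuit and returns the modified header.
    """
    fifo = bytearray()
    result = None
    for b in byte_stream:
        fifo.append(b)
        if len(fifo) == HEADER_LEN:       # header complete: lookup begins early
            result = do_lookup(bytes(fifo))
    return result, bytes(fifo)            # lookup result + full packet
```

The point of the sketch is the overlap: the lookup latency is hidden behind the remaining byte arrivals rather than added after full packet buffering.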
The queue manager uses the class of service information in the packet header to enqueue the packet according to its required priority, again at essentially line rate. Enqueued packets are buffered in a large memory space holding multiple packets prior to transmission across the device's switch fabric (interconnect) to the outbound linecard.
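Class of service enqueueing can be modeled as one FIFO per class, drained in priority order. This is a minimal behavioral sketch; the class count, names, and the strict-priority drain shown are assumptions (the hardware backs these queues with the large multi-packet memory and its own scheduling, such as the MDRR scheme shown in the drawings).

```python
from collections import deque

class CosQueueManager:
    """Behavioral sketch of the queue manager: one FIFO per class of
    service, dequeued by priority (class 0 = highest)."""

    def __init__(self, num_classes=8):
        self.queues = [deque() for _ in range(num_classes)]

    def enqueue(self, packet, cos):
        # CoS extracted from the packet header selects the queue
        self.queues[cos].append(packet)

    def dequeue(self):
        for q in self.queues:      # highest priority class first
            if q:
                return q.popleft()
        return None                # nothing buffered
```

With this model, a high priority packet arriving after a low priority one is still transmitted first, which is the low-latency property the architecture targets.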
On arrival at the outbound linecard, the packet is (in one embodiment of the present invention) rate limited and enqueued in the outbound transmitter portion of the linecard architecture. A large, multi-packet memory structure, as employed in the inbound queue manager, provides buffering prior to transmission onto the network via an appropriate physical layer interface module.
`
BRIEF DESCRIPTION OF THE DRAWINGS

The present disclosure may be better understood, and its numerous objects, features, and advantages made apparent to those skilled in the art by referencing the accompanying drawings.

FIG. 1 is a high-level schematic representation of a router/switch system that contains prior art circuitry or the circuit/process of the present invention.

FIG. 2 is a high-level schematic of linecard control element 130 according to one embodiment of the present invention.

FIG. 3 is a high-level schematic of a portion of the inbound data path according to one embodiment of the present invention.
`F