A Fast Track Architecture for UDP/IP and TCP/IP
Roy Chua
rchua@eecs.berkeley.edu

MacDonald Jackson
trey@cs.berkeley.edu

Marylou Orayani
marylou@cs.berkeley.edu

May 9, 1995

Abstract

Poor TCP/IP and UDP/IP implementations have been cited as a factor in the failure of remote applications to keep pace with the speeds of modern communication networks. We discuss some of the factors cited as contributors to the protocols' poor performance. We then discuss the fast track architecture for UDP/IP and TCP/IP, which improves protocol performance by optimizing for the common rather than the general case and by bypassing the operating system completely, thereby reducing any overhead that may be incurred while in the operating system. We then present our implementation of a fast track UDP/IP along with proposals for fast track TCP/IP implementations.

1 Introduction

Many researchers have observed that the performance of remote applications has not kept pace with modern communication network speeds. Part of this imbalance is attributed to the protocols used by the remote applications, namely, UDP/IP and TCP/IP. It is now widely believed that the problems with these protocols are not inherent in the protocols themselves but in their particular implementations [Dalt93, Whet95]. The following factors are cited as contributors to the poor performance of UDP/IP and TCP/IP:

Multiple copying of data This is considered the biggest bottleneck. Receiving a packet requires two copies: one copy from the network interface to the kernel and another from the kernel to the receiving application. Sending a packet also requires two copies: one from the sending application to the kernel and another from the kernel to the network interface. Measurements in [Kay93] show that moving packet data takes up 9% of TCP processing time and 16% of UDP processing time.

Protocol layering Layering facilitates protocol design but not implementation. It causes bad buffering decisions which generate unnecessary memory traffic [Jacob93]. Packets are forced to go from application buffers to socket buffers to kernel buffers and finally to the network interface buffers. Layering also hinders parallelism [Jacob93], forcing applications to call generic routines instead of protocol-specific ones and causing a bottleneck at the generic routines.

Optimization for the general rather than the common case Because of the generality of the code, a lot of time is spent processing packet headers, calculating checksums in the kernel, and managing buffers [Whet95]. This was acceptable given the unreliability of the internetworks within which the protocols originally operated, but on modern networks most packets arrive in sequence and are error-free [Jacob93].

Checksum computation Checksum calculation is known to be expensive. In addition, it requires an additional perusal of a packet's header and data. Measurements in [Kay93] show that checksumming of TCP/IP packets takes up only 7% of TCP processing time while it takes up 25% of UDP processing time. This is attributed to the larger sizes of UDP packets: analysis of FDDI network traffic in a university LAN [Kay93] found a median TCP packet size of 32 bytes and a median UDP packet size of 128 bytes. (A sketch of the checksum computation appears after this list.)

Kernel buffer manipulation The kernel buffers used for network packet processing are the memory buffers (mbufs) and the socket buffers. According to [Kay93], mbuf and socket buffer processing takes up 18% of TCP processing time and 15% of UDP processing time.

Context switches A context switch involves a process blocking itself (e.g., via the sleep system call), waking up another process, and restarting it. According to Kay et al. [Kay93], context switches take up about 5% of TCP processing time and 4% of UDP processing time.

Error checking Error checking time is largely taken up by the validation of arguments to system calls. This takes up 5% of TCP processing time and 7% of UDP processing time.

Interrupts When a packet arrives, two interrupts are generated: a hardware interrupt generated by the network interface to indicate that a packet has arrived and a software interrupt generated by the network interface device driver to tell IP that a new packet has arrived. Generating the software interrupt and handling it by dequeueing the incoming packet and calling IP takes up 2% of processing time for both TCP and UDP.

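To make the checksum cost concrete, the following is a minimal sketch of the 16-bit ones'-complement Internet checksum used by IP, TCP, and UDP (RFC 1071). The function name is ours; production implementations typically fold this loop into the data copy, as noted in Section 2.

#include <stddef.h>
#include <stdint.h>

/* 16-bit ones'-complement sum over a buffer (RFC 1071). Every byte
 * of the packet is read once, which is why checksumming cost grows
 * with packet size. */
uint16_t inet_checksum(const void *buf, size_t len)
{
    const uint8_t *p = buf;
    uint32_t sum = 0;

    while (len > 1) {                       /* sum 16-bit words */
        sum += ((uint32_t)p[0] << 8) | p[1];
        p += 2;
        len -= 2;
    }
    if (len == 1)                           /* pad a trailing odd byte */
        sum += (uint32_t)p[0] << 8;

    while (sum >> 16)                       /* fold carries back in */
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;                  /* ones' complement of the sum */
}
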
In [Whet95], a fast track architecture for UDP/IP and TCP/IP was proposed. Its main aim is to improve protocol performance by optimizing for the common case and by bypassing the operating system (OS) completely, reducing any overhead that may be incurred while in the OS. Given network interface support, the fast track can also limit the copying of data to a single copy. The architecture distinguishes between two types of packets: common case and non-common case (common case criteria are discussed in detail in Section 3) and only processes the common case packets. A packet buffer is shared between the application and the network interface device driver. When a common case packet arrives at the receiving host, the device driver copies the packet into the packet buffer it shares with the application. This effectively demultiplexes the packet directly to the application and bypasses the operating system completely. To send a common case packet, the sending application directly copies the packet into the buffer it shares with the device driver and calls a device driver routine to send the packet to its destination.

The fast track architecture is applicable to a wide range of network interfaces [Whet95] and can coexist with other UDP/IP and TCP/IP implementations.

Section 2 discusses problems reported and solutions proposed by researchers in the area. Section 3 presents the goals of the fast track architecture. Section 4 describes the fast track implementation of UDP/IP along with the current UDP/IP implementation. Section 5 is a discussion of fast track and the current TCP/IP implementation. Section 6 is an analysis of the fast track UDP/IP implementation. Section 7 presents difficulties we encountered during implementation. Section 8 presents our conclusions and summary.

2 Previous Work

Various researchers have proposed and implemented solutions to overcome the problems with UDP/IP and TCP/IP implementations discussed in Section 1. In [Part94], Craig Partridge showed ways to speed up a UDP/IP implementation, one of which is to calculate the checksum while copying the data. Van Jacobson [Jacob93] has implemented a high performance TCP/IP by reducing the number of interrupts generated for each packet sent or received, by touching any piece of data exactly once, and by doing away with layering and maximizing parallelism. His code requires only 37 instructions and 7 memory references to forward an IP packet, and fewer than 60 instructions and 22 memory references to process a TCP packet.
Dalton et al. [Dalt93] modified TCP/IP to support the Afterburner card which they designed and built. The card provides buffers that can be directly accessed by the operating system so that data is copied only once: from the on-card buffers to the application buffers for receiving and vice-versa for sending. They achieved a throughput of 210 Mbps with 14 KB packets.
Brustoloni [Brust94] used exposed buffering to avoid copying of data altogether. A special area in kernel memory is allocated specifically for these exposed buffers so that both the applications and the network interface can share access to them. When a packet is received, it is placed directly in these buffers and the
application reads the packet directly from them. To send a packet, the application writes into the buffers and the network interface sends it from the same buffers. This was implemented in the Mach operating system on a host connected to an ATM LAN. They achieved a throughput of 48.4 Mbps with 64 KB datagrams and 48.5 Mbps with 64 KB sequenced packets.
Kay et al. [Kay93] suggested avoiding checksumming for both TCP and UDP when the communicating hosts are on the same LAN, provided the LAN supports a hardware CRC. They also stated that reducing TCP protocol-specific processing time is useful but will require a major modification of the TCP implementation. It is unclear whether the effort required is worth the performance improvement that would be gained.
Whetten et al. [Whet95] proposed an implementation of a fast track receiver and sender for both TCP/IP and UDP/IP. We initially tried to implement the fast track TCP/IP receiver but encountered difficulties with the proposed implementation. The proposal did not take into account that for both the fast track and non-fast track TCP/IP implementations to coexist, they must be able to share state information about TCP/IP connections. We had a few options:

• Remove the requirement that fast track and non-fast track TCP/IP must coexist.
  This is infeasible since fast track setup requires assistance from non-fast track TCP/IP. In addition, non-common case TCP/IP packets would never be processed.

• Modify TCP/IP to be able to more readily share state information with fast track TCP/IP.
  Given the complexity of TCP/IP code, we believed this would require too much work given the amount of time we had.

• Implement fast track UDP/IP instead.
  This is the option we eventually chose because of the simplicity of UDP/IP relative to TCP/IP.

In the remaining sections of the paper, we first present the fast track goals for both UDP/IP and TCP/IP and follow this with a discussion of the fast track UDP/IP architecture and implementation. This is then followed by a discussion of the fast track TCP/IP architecture and a number of proposals for its implementation. We then present an analysis of our fast track UDP/IP implementation.

3 Fast Track Goals

For both UDP/IP and TCP/IP, the goals of the fast track architecture are to optimize for the common case (vs. the general case), to minimize operating system overhead, and to minimize data copying. A UDP/IP packet is considered common case for the receiver if it satisfies the following criteria:

• It uses the correct IP version (4).

• It has the correct header length (5).

• It is not fragmented.

• It is destined for this host.

• It is error-free.

• A socket already exists for the packet.

The criteria for a TCP/IP packet are identical, with the following additions:

• It is not a control packet.

• It has the correct sequence number (i.e., the packet arrived in order).

• It has no variable-length options.

[Figure 1: Current UDP/IP Receiver. Packets flow from the network interface level through the OS hardware and software interrupt levels into per-application socket receive buffers.]

The criteria for a common case UDP/IP packet for the sender are:

• No packets are waiting to be sent.

• No fragmentation is required.

• An outgoing socket already exists.

• The outgoing link address has already been resolved using ARP.

The criteria for sending a TCP/IP packet are identical, with the following additions:

• The window size must be big enough and have space.

• There must be a valid connection.

Though treated separately in this paper, fast track UDP/IP and TCP/IP can and should be combined in one implementation. The separation was done for simplicity and clarity. (A sketch of a receive-side common case test built from these criteria follows.)

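To make the receive-side criteria concrete, the following is a minimal sketch of a common case test for an incoming packet, written against BSD-style header structures. The function name and the demux_table_lookup helper are our assumptions, and the error-free (checksum) and destined-for-this-host checks are elided.

#include <stdint.h>
#include <netinet/in.h>     /* IPPROTO_UDP, ntohs */
#include <netinet/ip.h>     /* struct ip, IP_MF, IP_OFFMASK */
#include <netinet/udp.h>    /* struct udphdr */

/* Hypothetical helper: does a fast track socket (a DemuxTable entry)
 * exist for this destination port (network byte order)? */
extern int demux_table_lookup(uint16_t dst_port);

/* Returns nonzero if an incoming packet meets the receive-side
 * common case criteria of Section 3. */
static int is_common_case_udp(const struct ip *iph)
{
    const struct udphdr *uh;

    if (iph->ip_v != 4)                        /* correct IP version */
        return 0;
    if (iph->ip_hl != 5)                       /* correct header length */
        return 0;
    if (ntohs(iph->ip_off) & (IP_MF | IP_OFFMASK))
        return 0;                              /* not fragmented */
    if (iph->ip_p != IPPROTO_UDP)              /* must be UDP */
        return 0;

    uh = (const struct udphdr *)((const char *)iph + sizeof(struct ip));
    return demux_table_lookup(uh->uh_dport);   /* socket already exists */
}
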
4 Fast Track UDP/IP

In this section, we will first describe the current implementation of the UDP/IP receiver and then contrast it with the fast track implementation. This is followed by a similar discussion of the UDP/IP sender.

[Figure 2: Fast Track UDP/IP Receiver. Packets flow from the network interface level through the OS hardware interrupt level directly into per-application receive buffers.]

4.1 Current UDP/IP Receiver

Figure 1 shows a diagram of the current UDP/IP receiver. When a packet arrives at the network interface, it generates a hardware interrupt which activates the interface's device driver. If the packet is an IP packet, the driver puts it into the IP input queue, ipintrq, and generates a software interrupt to inform the IP interrupt handler, ipintr, that a new IP packet has arrived. Based on the packet's protocol type (specified in the IP header), IP demultiplexes the packet to the appropriate protocol. If it is a UDP packet, ipintr invokes the UDP input function, udp_input, to process the incoming packet. udp_input first validates the incoming packet by verifying its length and by calculating its checksum. udp_input then determines the packet's recipient application and puts the packet into that application's socket receive buffer. Two interrupts are generated: the hardware interrupt generated by the network interface and the software interrupt generated by the device driver. One can clearly see the bottleneck at ipintr and udp_input due to layering.

4.2 Fast Track UDP/IP Receiver

We implemented the receiver on an HP 9000/700 workstation running HP-UX version A.09.01. Figure 2 shows a diagram of the fast track UDP/IP receiver. When a packet arrives at the network interface, it generates a hardware interrupt which activates the interface's device driver. The driver then calls the fast track demultiplexer, ft_demux, which determines whether the packet is common case or not. If it satisfies all the common case criteria enumerated in Section 3 above, ft_demux copies the packet from the network interface to the appropriate application receive buffer (which replaces the socket receive buffer). Otherwise, the packet is placed in the appropriate protocol input queue and the appropriate software interrupt is generated. Only one interrupt is generated: the hardware interrupt generated by the network interface. Except for the brief stay in the demultiplexer, the operating system is completely bypassed. In addition, no bottlenecks exist.

[Figure 3: Demultiplexer Table Entry Format. Fields: Destination Port (2 bytes), Virtual Space Id (4 bytes), Virtual Address Space Pointer (4 bytes), Application Receive Buffer Pointer (4 bytes).]

create socket

while more packets to receive
    call ft_recvfrom

Figure 4: Fast Track Application Server Algorithm

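As a usage illustration, the following C sketch fleshes out the server loop of Figure 4. The exact ft_recvfrom signature is our assumption (recvfrom's parameters plus the application's bound address, per Section 4.2.2), and the port number is arbitrary.

#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Assumed fast track receive call: recvfrom's parameters plus the
 * application's own bound address (the address passed to bind). */
extern int ft_recvfrom(int s, void *buf, int len, int flags,
                       struct sockaddr *from, int *fromlen,
                       struct sockaddr *bound);

int main(void)
{
    char pkt[2048];                     /* packet buffer supplied by the app */
    struct sockaddr_in me, from;
    int s, fromlen, n;

    s = socket(AF_INET, SOCK_DGRAM, 0); /* create socket */
    memset(&me, 0, sizeof me);
    me.sin_family = AF_INET;
    me.sin_port = htons(7777);          /* example port */
    bind(s, (struct sockaddr *)&me, sizeof me);

    for (;;) {                          /* while more packets to receive */
        fromlen = sizeof from;
        n = ft_recvfrom(s, pkt, sizeof pkt, 0,
                        (struct sockaddr *)&from, &fromlen,
                        (struct sockaddr *)&me);
        if (n > 0)
            printf("received %d bytes\n", n);
    }
}
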
4.2.1 Fast Track Demultiplexer Table

The key component of the fast track receiver is the fast track demultiplexer, ft_demux. The key data structure used by ft_demux is the demultiplexer table, DemuxTable. Each socket that uses the fast track has an entry in DemuxTable. Each entry contains the port used by the receiving application (the destination port), the application's virtual space id, a pointer to its virtual address space, and a pointer to its receive buffer. The destination port essentially identifies the receiving application because each server socket can only be used by one server. The virtual space id and pointer are needed by the demultiplexer to correctly access the application's receive buffer. The receive buffer replaces the socket receive buffer to minimize (if not eliminate) the use of kernel buffers. Figure 3 shows the format of an entry and the number of bytes each field uses (a C rendering of this layout follows). ft_demux also uses hint pointers to the most recently used entries in the demultiplexer table to take advantage of the fact that most packets demultiplex to the same place [Jacob93].

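The following is a minimal C rendering of the entry layout of Figure 3; the field and variable names, the table size, and the number of hint pointers are our assumptions.

#include <stdint.h>

#define DEMUX_TABLE_SIZE 64   /* table size: our assumption */
#define NUM_HINTS 4           /* number of hint pointers: our assumption */

/* One DemuxTable entry, matching Figure 3: a 2-byte destination port,
 * a 4-byte virtual space id, and two pointers (4 bytes each on the
 * 32-bit HP 9000/700): the application's virtual address space and
 * its receive buffer. */
struct demux_entry {
    uint16_t  dest_port;   /* identifies the receiving application */
    uint32_t  vspace_id;   /* application's virtual space id */
    void     *vspace;      /* pointer to its virtual address space */
    void     *recv_buf;    /* pointer to its receive buffer (FIFO) */
};

struct demux_entry  DemuxTable[DEMUX_TABLE_SIZE];

/* Hint pointers to the most recently used entries, exploiting the
 * observation that most packets demultiplex to the same place. */
struct demux_entry *hints[NUM_HINTS];
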
4.2.2 Fast Track Recvfrom

To enable fast track UDP/IP receiving, the application only has to call ft_recvfrom instead of recvfrom. Calling ft_recvfrom is just like calling recvfrom except that an additional parameter is given: the application's address as specified by bind. Two buffers are needed by an application to receive packets: a receive buffer used for queueing packets destined for the application and a packet buffer used to store a packet from the receive buffer. The receive buffer is a first-in-first-out (FIFO) buffer. The packet at the beginning of the receive buffer is moved from the receive buffer into the packet buffer. Applications can only directly read packets from the packet buffer. To minimize any additional work the application has to do, space needed for the receive buffer is allocated by ft_recvfrom. The packet buffer is supplied by the application as a parameter to ft_recvfrom (just as in recvfrom). When called for the first time, ft_recvfrom creates and inserts an entry for the specified socket (the destination port) into the demultiplexer table. ft_recvfrom also calls recvfrom to receive the socket's very first packet. Subsequent calls to ft_recvfrom will first poll the application's receive buffer until a packet is placed in it by ft_demux. For simplicity, polling is used instead of select to detect data in the receive buffer. Each packet placed by ft_demux in the receive buffer can be one of two types: common case (fast track) or non-common case (slow track). If the packet at the beginning of the buffer is slow track, the actual packet data is not in the buffer, since ft_demux does not process slow track packets. To process a slow track packet, ft_recvfrom calls recvfrom to receive the packet using the conventional operating system channels. If the packet is fast track, it is copied from the application's receive buffer into its packet buffer. Figures 4 and 5 show the algorithms for the fast track application server and ft_recvfrom, respectively.

if called for the first time
    call recvfrom
    allocate space for receive buffer
    add an entry to demultiplexer table
else
    poll receive buffer for data
    if slow track packet at head of buffer
        call recvfrom
    else
        copy packet into specified packet buffer
return number of bytes received

Figure 5: ft_recvfrom Algorithm

4.2.3 Fast Track Demultiplexer

When an incoming packet arrives, ft_demux is activated by the hardware interrupt generated by the network interface to indicate the packet's arrival. ft_demux first checks that the packet is a UDP/IP packet. If it is not, it is simply sent back to the OS. Otherwise, ft_demux checks that the packet contains the correct IP version number (4) and the correct header length (5). If it does not, it is sent back to the OS. Otherwise, ft_demux consults the demultiplexer table, DemuxTable, for an entry that has the same source/sender IP address, the same source and destination ports, and the same protocol (UDP) as the incoming packet. If a matching entry is found, the packet is copied into the application receive buffer specified in the DemuxTable entry. To speed up searches, hint pointers are used to keep track of the most recently used table entries. These table entries are consulted first, and if no match is found using the hint pointers, the entire table is searched for matching entries. Figure 6 shows the ft_demux algorithm.

4.3 Current UDP/IP Sender

Figure 7 shows a diagram of the current UDP/IP sender. When a process, called a UDP Client, sends a packet using the system call sendto, the destination socket address structure and user data are copied into an mbuf chain. This mbuf chain is then passed to the protocol layer, first udp_output, then ip_output, which prepend the appropriate headers. ip_output calls if_output to pass the datagram to the network interface layer, where the packet is placed in a queue to be transmitted to the network. Note that user data does not wait in any queues until it reaches the device driver.

4.4 Fast Track UDP/IP Sender

We implemented the sender on the same machine as the receiver, an HP 9000/700. Figure 8 shows the control flow of the fast track UDP/IP sender. The UDP Client sends a datagram by calling ft_sendto. If the packet meets the common case conditions described in Section 3 above, the data is copied to an mbuf chain with the UDP and IP headers prepended, and the packet is sent directly to the network interface to be sent out to the network. If the packet fails any of the conditions, control is passed to sendto.
The fast track UDP code reduces some overhead by caching header fields and cutting out two function calls. ft_sendto has a MuxTable, which performs the same role as the DemuxTable does on the receive side. The table is indexed by the UDP Client's source and destination sockaddr's. Port pairs alone will not resolve the ambiguity because the UDP Client may send to more than one destination through the same socket, and each of those destinations may use the same port. For the same reason, socket file descriptors will not work. The locations of the sockaddr's, however, will be unique. (A sketch of a MuxTable entry follows.)

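The following is a minimal sketch of a MuxTable entry as described above; the structure and field names are ours.

#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/ip.h>
#include <netinet/udp.h>

/* One MuxTable entry: keyed by the (unique) locations of the client's
 * source and destination sockaddr's, and caching prebuilt UDP and IP
 * headers so that only lengths and checksums change per packet. */
struct mux_entry {
    struct sockaddr *sorc;     /* location of the source sockaddr (key) */
    struct sockaddr *dest;     /* location of the destination sockaddr (key) */
    struct ip        ip_hdr;   /* cached IP header template */
    struct udphdr    udp_hdr;  /* cached UDP header template */
};
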
if packet is not IP
    return to OS

if packet is not UDP
    return to OS

if packet's IP version is incorrect
    return to OS

if packet's header length is incorrect
    return to OS

for each hint pointer
    if tableEntry[hintPointer].destPort == packet.destPort
        copy packet into tableEntry[hintPointer].recvBuf
endfor

if no matches found
    for each table entry
        if tableEntry[entryNo].destPort == packet.destPort
            copy packet into tableEntry[entryNo].recvBuf
    endfor

if no matches found
    return to OS

Figure 6: ft_demux Algorithm

[Figure 7: Current UDP/IP Sender. Data flows from the UDP Client through udp_output and ip_output at the OS software interrupt level down to the network interface level.]

[Figure 8: Fast Track UDP/IP Sender. Data flows from the UDP Client directly to the network interface level, bypassing the OS software interrupt level.]

Cached in the MuxTable are both the UDP and the IP headers. When ft_sendto receives a packet from a new UDP Client, the MuxTable entries are filled in, and the packet is sent to the network using sendto (because this is a special case of not meeting the fast track criteria). When a packet is received by ft_sendto from a known UDP Client, the headers are copied into the mbuf using the cached values. The only header fields that need to be updated are the IP header checksum, the IP length, the UDP length, and the UDP checksum. Currently the UDP checksum is not computed. See Figure 9 for the algorithm. (A sketch of this per-packet header update follows.)
Another optimization is the assumption that there are no packets waiting to be sent in the network driver. This facilitates speedy data transmission. Currently the network interface does not support such a check.

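To illustrate the per-packet work that remains once the headers are cached, the following sketch patches the fields named above. It reuses the inet_checksum sketch from Section 1 (ours), and the function and parameter names are likewise our assumptions.

#include <stddef.h>
#include <stdint.h>
#include <netinet/in.h>
#include <netinet/ip.h>
#include <netinet/udp.h>

extern uint16_t inet_checksum(const void *buf, size_t len);

/* Patch the cached headers for one outgoing datagram carrying
 * 'datalen' payload bytes at 'pkt' (an mbuf data area in the real
 * code). Only the lengths and the IP header checksum change per
 * packet; as noted above, the UDP checksum is left uncomputed. */
void ft_patch_headers(const struct ip *cached_ip,
                      const struct udphdr *cached_udp,
                      char *pkt, size_t datalen)
{
    struct ip     *iph = (struct ip *)pkt;
    struct udphdr *uh  = (struct udphdr *)(pkt + sizeof(struct ip));

    *iph = *cached_ip;                          /* cached IP header */
    *uh  = *cached_udp;                         /* cached UDP header */

    iph->ip_len = htons(sizeof(struct ip) + sizeof(struct udphdr) + datalen);
    uh->uh_ulen = htons(sizeof(struct udphdr) + datalen);
    uh->uh_sum  = 0;                            /* UDP checksum not computed */

    iph->ip_sum = 0;                            /* recompute IP header checksum */
    iph->ip_sum = htons(inet_checksum(iph, sizeof(struct ip)));
}
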
5 Fast Track TCP/IP

In this section, we first give a brief description of the current implementation of the TCP/IP receiver and follow it with a general description of the fast track TCP/IP receiver. We then present a number of proposals on how a fast track TCP/IP receiver can be implemented. This is followed by a similar discussion of the sender.

5.1 Current TCP/IP Receiver

Figure 10 is a diagram of the current implementation of the TCP/IP receiver. When a packet arrives at a network interface, that interface generates a hardware interrupt which activates its driver to process the incoming packet. If the packet is an IP packet, the driver puts it into the IP input queue, ipintrq, and generates a software interrupt to activate the IP interrupt handler, ipintr. If the packet is a TCP packet, ipintr then calls the TCP input function, tcp_input, to further process the packet.

if packet is not IP
    sendto(packet)
if packet needs to be fragmented
    sendto(packet)

for each hint pointer
    if ((MuxTable[hintPointer].sorc == uap->from) &&
        (MuxTable[hintPointer].dest == uap->to))
        header = MuxTable[hintPointer].header
endfor

if no matches found
    for each table entry
        if ((MuxTable[entryNo].sorc == uap->from) &&
            (MuxTable[entryNo].dest == uap->to))
            header = MuxTable[entryNo].header
    endfor

if no matches found
    add new entry

if ARP resolved
    send to network interface
else
    sendto(packet)

Figure 9: ft_sendto Algorithm

[Figure 10: Current TCP/IP Receiver. Packets flow from the network interface level through the OS hardware and software interrupt levels into per-application socket receive buffers.]

tcp_input first validates the incoming packet by calculating its checksum and comparing it with the checksum included in the TCP header. It then calculates the amount of space available in the socket receive buffer (this available space is called the receive window) for the receiving application. The received packet is then trimmed (if necessary) to fit into the receive window. tcp_input then queues the trimmed packet into the application's socket receive buffer. The application obtains the packet from this receive buffer via one of the socket receive functions, e.g., recv and read. In addition to queueing the packet into the appropriate socket receive buffer, tcp_input performs some critical functions to correctly maintain the connection's state. These include updating the receive and congestion windows if needed, determining the sequence number of the next expected packet, remembering the packets received but not yet acknowledged, and sending acknowledgements for received packets. This state information maintenance is what makes TCP input processing much more complicated than UDP's.
In the process of receiving an incoming TCP/IP packet, two interrupts are generated: the hardware interrupt generated by the network interface and the software interrupt generated by the device driver. As in UDP, a bottleneck exists at ipintr and tcp_input due to layering.

5.2 Fast Track TCP/IP Receiver

Implementation of the fast track TCP/IP receiver is structurally identical to its UDP/IP counterpart. Figure 11 shows a diagram of the implementation structure.
When a packet arrives at a network interface, it generates a hardware interrupt to activate its device driver so it can process the incoming packet. The driver then calls the fast track demultiplexer, ft_demux, to demultiplex the packets to the appropriate application receive buffers. As in the fast track UDP/IP receiver, the demultiplexer is the key component of the receiver (see Section 4.2.3 for details).
To receive fast track TCP/IP packets, the application must invoke the usual sequence of calls: socket, bind, listen, and accept. accept creates a new socket for each new connection accepted.

[Figure 11: Fast Track TCP/IP Receiver. Packets flow from the network interface level through the OS hardware interrupt level directly into per-application receive buffers.]

It is at this point that a demultiplexer table entry is created for the new socket and added to the demultiplexer table. Instead of calling read or recv, the application must call ft_recv, which receives fast track TCP/IP packets processed by ft_demux. ft_recv behaves just like ft_recvfrom (used for UDP/IP packets) except that it has to perform additional functions to maintain state information for the connection and share this information with the non-fast track TCP/IP for both to function correctly. This makes ft_recv much more complicated than ft_recvfrom.

5.3 Current TCP/IP Sender

The control and data flow of the current implementation of the TCP/IP sender is shown in Figure 12. When a TCP Client calls send to send a packet, the data flows much as it does in the UDP case. The data is copied to an mbuf chain. This mbuf chain is passed down to the network interface level through tcp_output and ip_output, both of which prepend the proper headers. ip_output passes the mbuf to the network interface by calling if_output.
Unlike UDP, sending data using TCP can result in queueing user data before it reaches the network drivers. Because TCP is a connection-oriented service based on a sliding window scheme, a large amount of state must be managed in order to guarantee the service it provides. tcp_usrreq is the function which handles this "bookkeeping", including connection management, congestion window sizing, timers and the associated timing computation, and data retransmission. For a detailed description of tcp_usrreq see [Wrig95]. Due to this high overhead and to lost packets, user data can be queued at both the TCP and the network interface layers.

5.4 Fast Track TCP/IP Sender

[Figure 12: Current TCP/IP Sender. Data flows from the TCP Client through tcp_output and ip_output at the OS software interrupt level down to the network interface level.]

Figure 13 is markedly different from its counterpart describing the current TCP/IP sender. Upon receiving a packet to send, ft_send would check the packet to determine if it is common case and, if so, retrieve the appropriate headers from the cache. The only header fields necessary to compute are the IP length and header checksum, and the TCP sequence number, acknowledgment number, window information, and checksum. The only additional overhead is incurred by calculating and setting the retransmission timer.
In order for a TCP Client to send data it must invoke the usual system calls: socket, bind, and connect. During the connect call a new entry in the MuxTable would be created for this connection. To send data using the fast track the client would use the system call ft_send. Because ft_send coexists with the original implementation of TCP, the two must share state in order to perform reliably. This need for sharing state between the fast track and the original will result in a very hairy implementation. (A sketch of the per-packet header update described above follows.)

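The following is a hedged sketch of the per-packet header update that ft_send would perform, assuming a cached header template and sequence state per connection. The structure, field, and function names are ours, and the checksums and retransmission timer are left to later steps.

#include <stddef.h>
#include <stdint.h>
#include <netinet/in.h>
#include <netinet/ip.h>
#include <netinet/tcp.h>

/* Hypothetical cached state for one fast track TCP connection. */
struct tcp_mux_entry {
    struct ip     ip_hdr;    /* cached IP header template */
    struct tcphdr tcp_hdr;   /* cached TCP header template */
    uint32_t      snd_nxt;   /* next sequence number to send */
    uint32_t      rcv_nxt;   /* next sequence number expected */
    uint16_t      rcv_wnd;   /* receive window to advertise */
};

/* Fill in the per-packet fields named in Section 5.4: the IP length
 * and the TCP sequence number, acknowledgment number, and window.
 * The IP and TCP checksums and the retransmission timer would be
 * handled after this step. */
void ft_patch_tcp_headers(struct tcp_mux_entry *e, char *pkt, size_t datalen)
{
    struct ip     *iph = (struct ip *)pkt;
    struct tcphdr *th  = (struct tcphdr *)(pkt + sizeof(struct ip));

    *iph = e->ip_hdr;
    *th  = e->tcp_hdr;

    iph->ip_len = htons(sizeof(struct ip) + sizeof(struct tcphdr) + datalen);
    th->th_seq  = htonl(e->snd_nxt);
    th->th_ack  = htonl(e->rcv_nxt);
    th->th_win  = htons(e->rcv_wnd);

    e->snd_nxt += datalen;   /* advance the send sequence space */
}
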
5.5 Fast Track TCP/IP Receiver Implementation Proposals

Because of its guarantee of reliability, TCP has to maintain some state while a connection exists between two sockets. On the receiving end, this includes the sequence number of the next incoming packet, the number of packets received so far and still unacknowledged, and the receive and congestion window sizes (for congestion control). In addition to maintaining state, the fast track receiver also has to inform the non-fast track, or slow track, TCP/IP receiver of the current connection state. This is where most of the complexity lies. If all incoming TCP/IP packets were fast track, implementation would be much simpler since all the fast track receiver has to do is maintain its own data structures. However, in order to correctly process slow track packets, the slow track TCP/IP receiver must be informed of what has transpired during fast track processing. The bottom line is that both fast track and slow track TCP/IP receivers must be able to share state information for them to function correctly, since their actions are based on the connection's state. Several ways of implementing this are proposed in the following sections.

[Figure 13: Fast Track TCP/IP Sender. Data flows from the TCP Client directly to the network interface level, bypassing the OS software interrupt level.]

5.5.1 State Information Sharing

One way that both fast track and slow track TCP/IP can share information is by sharing data structures. This would require a reimplementation of TCP/IP so that it can more readily share data structures with its fast track counterpart. This can be done by having a global data structure that contains the minimal information needed by the slow track TCP/IP to function correctly. This information includes the sequence number of the next incoming packet, the packets received but not yet acknowledged, and the sizes of the receive and congestion windows. Each time fast track TCP/IP processes a packet, it updates this shared data structure. When slow track TCP/IP is reactivated to process a slow track packet, it first consults the shared data structure to update its state information and then proceeds to process the incoming packet, which includes updating the shared data structure. This facilitates switching between fast and slow tracks. To obtain further performance gains from fast track, we can defer the updating of items not used by fast track until a slow track packet comes in. Before switching to slow track, we would then update all the necessary fields to guarantee correctness. In addition to sharing data structures, we have to watch out for retransmission timers and RTT estimates. The issue of sharing retransmission timers between fast and slow track is unresolved and will require careful design and thought. (A sketch of such a shared structure follows.)

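The following is a minimal sketch of such a shared structure; the field names and the synchronization note are ours.

#include <stdint.h>

/* Hypothetical per-connection state shared between the fast track and
 * slow track TCP/IP receivers, holding the minimal information listed
 * above. Both tracks read and update it, so a real implementation
 * would have to synchronize access to it. */
struct ft_shared_state {
    uint32_t rcv_nxt;       /* sequence number of the next incoming packet */
    uint32_t unacked_pkts;  /* packets received but not yet acknowledged */
    uint32_t rcv_wnd;       /* receive window size */
    uint32_t cwnd;          /* congestion window size */
    int      needs_sync;    /* set by fast track when it defers updates;
                               slow track must resynchronize before use */
};
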
5.5.2 User-Level
