Protocol Implementation on the Nectar Communication Processor

Eric C. Cooper, Peter A. Steenkiste, Robert D. Sansom, and Brian D. Zill

September 1990
CMU-CS-90-153

School of Computer Science
Carnegie Mellon University
Pittsburgh, PA 15213

SIGCOMM '90 Symposium on Communications Architectures and Protocols
Philadelphia, Pennsylvania
September 24-27, 1990

This research was sponsored by the Defense Advanced Research Projects Agency, Information Science and Technology Office, under the title "Research on Parallel Computing," ARPA Order No. 7330, issued by DARPA/CMO under Contract MDA972-90-C-0035.

The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the U.S. Government.
Keywords: protocol implementation, high-speed networks
Abstract

We have built a high-speed local-area network called Nectar that uses programmable communication processors as host interfaces. In contrast to most protocol engines, our communication processors have a flexible runtime system that supports multiple transport protocols as well as application-specific activities. In particular, we have implemented the TCP/IP protocol suite and Nectar-specific communication protocols on the communication processor. The Nectar network currently has 25 hosts and has been in use for over a year.

The flexibility of our communication processor design does not compromise its performance. The latency of a remote procedure call between application tasks executing on two Nectar hosts is less than 500 µsec. The same tasks can obtain a throughput of 28 Mbit/sec using either TCP/IP or Nectar-specific transport protocols. This throughput is limited by the VME bus that connects a host and its communication processor. Application tasks executing on two communication processors can obtain 90 Mbit/sec of the possible 100 Mbit/sec physical bandwidth using Nectar-specific transport protocols.
1 Introduction

The protocols used by hosts for network communication can be executed on the host processors or offloaded to separate communication processors. In Nectar, a high-speed local-area network, we have taken the latter approach. By offloading transport protocol processing from the host to the communication processor, we reduce the burden on the host. Executing protocols on a communication processor is also attractive if the host is unsuited to protocol processing, as in the case of specialized architectures, or if the host operating system cannot easily be modified, as in the case of supercomputers.

Unlike traditional network front-end processors, the Nectar communication processor has a general-purpose CPU and a flexible runtime system that support both transport protocol processing and application-specific tasks. Protocol implementations on the communication processor can be added or optimized with no change to the host system software; this is particularly advantageous in environments with heterogeneous hardware and operating systems. Application-specific communication tasks can be developed for either the host or the communication processor using the Nectarine programming interface.

The interface between the communication processor and user processes on the host is based on shared memory. The buffer memory of the communication processor is directly accessible to user processes. No system calls or user-to-kernel copy operations are required to send and receive messages. As a result, host processes can communicate with lower latency (by a factor of 5) than would be possible using the UNIX socket interface of the host operating system [13].

Related work on host interfaces for high-speed networks includes the VMP Network Adapter Board [10] and the Protocol Engine design [4]. In these approaches, processing of specific transport protocols is offloaded to the network interface, but there is no provision for the execution of application-specific tasks or multiple transport protocols.

The Nectar runtime system is similar in structure to other operating systems designed specifically to support network protocols, such as Swift [7] and the x-kernel [12]. However, the Nectar system is distinguished by its emphasis on the interface between the communication processor and the host.

The Nectar communication processor together with its host can be viewed as a (heterogeneous) shared-memory multiprocessor. Dedicating one processor of a multiprocessor host to communication tasks can achieve some of the benefits of the Nectar approach, but this constrains the choice of host operating system and hardware. In contrast, the Nectar communication processor has been used with a variety of hosts and host operating systems.

In this paper we describe and evaluate our approach to building a communication processor. We first give an overview of the Nectar hardware (Section 2). We then describe the design of the runtime system on the communication processor and the interactions between the runtime system and host processes (Section 3). As an example of the use of the runtime system, we discuss our implementation of TCP/IP in Section 4. In Section 5, we discuss how the flexibility of the Nectar design allows different levels of communication functions to be offloaded from the host to the communication processor. A performance evaluation of the Nectar system is presented in Section 6.
Figure 1: Nectar system overview (hosts attached via VME bus to CABs, connected by fiber-optic links to HUBs)
2 The Nectar System

The Nectar system consists of a set of host computers connected in an arbitrary mesh via crossbar switches called HUBs (Figure 1). Each host uses a communication processor, called a CAB (Communication Accelerator Board), as its interface to the Nectar network. More details about the Nectar architecture can be found in an earlier paper [2].

2.1 HUB Overview

The Nectar network is built from fiber-optic links and one or more HUBs. A HUB consists of a crossbar switch, a set of I/O ports, and a controller. The controller implements commands that the CABs use to set up both packet-switching and circuit-switching connections over the network.

Large Nectar systems are built using multiple HUBs. In such systems, some of the HUB I/O ports are used to connect HUBs together. The CABs use source routing to send a message through the network. The HUB command set includes support for multi-hop connections and low-level flow control.

In the current Nectar system, the fiber-optic lines operate at 100 Mbit/sec and the HUBs are 16 x 16 crossbars. The hardware latency to set up a connection and transfer the first byte of a packet through a single HUB is 700 nanoseconds.

2.2 CAB Overview

A block diagram of the CAB is shown in Figure 2. The heart of the CAB is a general-purpose RISC CPU. Two optical fibers, one for each direction, connect the CAB to an I/O port on the HUB. The fibers are connected to FIFOs for temporary buffering of network data. Cyclic redundancy checksums for incoming and outgoing data are computed by hardware.
Figure 2: CAB block diagram (CPU, program memory, data memory, DMA controller, fiber in/out, VME interface, memory protection, registers and devices, serial line)
The CAB communicates with the host through a VME interface, a common backplane in our environment.

The CAB includes a hardware DMA controller that can manage simultaneous data transfers between the incoming and outgoing fibers and CAB memory, as well as between VME and CAB memory, leaving the CAB CPU free for protocol and application processing. The DMA controller also handles low-level flow control for network communication: it waits for data to arrive if the input FIFO is empty, or for data to drain if the output FIFO is full.

To provide the necessary memory bandwidth, the CAB memory is split into two regions: one intended for use as program memory, the other as data memory. DMA transfers are supported for data memory only; transfers to and from program memory must be performed by the CPU. The memory architecture is thus optimized for the expected usage pattern, while still allowing code to be executed from data memory or packets to be sent from program memory.

Memory protection hardware on the CAB allows access permissions to be associated with each 1 Kbyte page. Multiple protection domains are provided, each with its own set of access permissions. Changing the protection domain is accomplished by reloading a single register.

The current CAB implementation uses a SPARC processor running at 16.5 MHz. The program memory region contains 128 Kbytes of PROM and 512 Kbytes of RAM. The data memory region contains 1 Mbyte of RAM. Both memories are implemented using 35 nsec static RAM.
3 Runtime System

The CAB runtime system must support concurrent activities that include network interrupts, transport-protocol processing, and application-specific computation. A lightweight interface between the host and the CAB is also essential; expensive host-CAB synchronization, data copying, and system calls must be avoided.
Figure 3: Nectar software architecture
Figure 3 shows the structure of the Nectar software on the host and CAB. The basic CAB runtime system provides support for multiprogramming (the threads package) and for buffering and synchronization (the mailbox and sync modules). Transport protocols (described in Section 4) are implemented on the CAB using these facilities. The Nectarine layer provides a consistent interface for applications on both the CAB and the host. The CAB device driver in the host operating system allows host processes to map CAB memory into their address spaces.
3.1 Threads, Interrupts, and Upcalls

Previous protocol implementations have demonstrated that multiple threads are useful, but multiple address spaces are unnecessary [7, 11, 12]. Since we expected most of the activities on the CAB to be protocol-related, we designed the CAB to provide a single physical address space, and the runtime system to support a single address space shared by multiple threads. The runtime system can use the multiple protection domains described in Section 2 to provide firewalls around application tasks if desired.

The threads package for the CAB was derived from the Mach C Threads package [8]. It provides forking and joining of threads, mutual exclusion using locks, and synchronization by means of condition variables. Context switch time is determined by the cost of saving and restoring the SPARC register windows; 20 µsec is typical in the current implementation.
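The following is a minimal sketch of how an event-driven system thread might be written against C Threads-style primitives (cthread_fork, mutexes, and condition variables). The queue helpers and packet type are hypothetical placeholders, not part of the CAB runtime; the sketch only illustrates the fork/lock/wait pattern described above.

/*
 * Sketch of a CAB system thread using C Threads-style primitives.
 * queue_is_empty(), dequeue_packet(), process_packet(), and packet_t
 * are hypothetical; the real CAB package may differ in detail.
 */
#include <cthreads.h>

static mutex_t     queue_lock;   /* protects the packet queue        */
static condition_t packet_ready; /* signaled when a packet is queued */

static any_t protocol_server(any_t arg)
{
    for (;;) {
        mutex_lock(queue_lock);
        while (queue_is_empty())          /* hypothetical queue test */
            condition_wait(packet_ready, queue_lock);
        packet_t *pkt = dequeue_packet(); /* hypothetical dequeue    */
        mutex_unlock(queue_lock);

        process_packet(pkt);              /* brief burst of work,    */
    }                                     /* then wait for next event */
    return NULL;                          /* not reached */
}

void start_protocol_server(void)
{
    queue_lock   = mutex_alloc();
    packet_ready = condition_alloc();
    cthread_detach(cthread_fork(protocol_server, (any_t) 0));
}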
System threads (such as those implementing network protocols) are typically driven by events such as a packet arriving or a condition being signaled; after a brief burst of processing, they relinquish the processor by waiting for the next event. We make no such assumptions about application threads: they may perform long computations with few synchronization points, or they may get stuck in infinite loops. Preemption of application threads is therefore necessary. The current scheduler uses a preemptive, priority-based scheme, with system threads running at a higher priority than application threads.

Before we implemented preemptive scheduling of threads, upcalls [7] from interrupt handlers were the only way to provide sufficiently fast response to external events. For example, because of the speed at which an incoming packet fills the CAB input FIFO, a start-of-packet interrupt must be handled within a few tens of microseconds. Waking up another thread has an unacceptably long response time: the context switch would not occur until the currently running thread reached a synchronization point and relinquished the processor.
Figure 4: Host-CAB signaling (host signal queue and CAB signal queue between the CAB device driver and the CAB interrupt handler)
Using upcalls from interrupt level means that data structures must be shared between threads and interrupt handlers, resulting in critical sections that must be protected by appropriate masking of interrupts. Disabling interrupts is less elegant than protecting critical sections by means of module-specific mutual exclusion locks because it violates modularity. The implementor of an abstraction must know whether its callers are threads or interrupt handlers so that interrupts can be masked appropriately.

With preemption, a context switch occurs as soon as a higher-priority thread is awakened. We therefore plan to revisit our decision to perform significant amounts of protocol processing at interrupt time. We will experiment with moving portions of it into high-priority threads. Although this will introduce additional context switching, the CAB will spend less time with interrupts disabled, so overall performance is likely to improve.

The response time could also be improved by using the SPARC's interrupt priority scheme to implement nested interrupts. Although appropriate use of nested interrupts could further reduce latency, the cost would be greater complexity and lack of modularity in the code because the implementor of an abstraction would have to be aware of the possible interrupt priority levels of users of the abstraction.
3.2 Host-CAB Signaling

Host processes and CAB threads interact using shared data structures that are mapped into the address spaces of the host processes. To manipulate data structures in CAB memory, a host process must be able to do the following:

• Map CAB memory into its address space and translate between CAB physical addresses and host virtual addresses.

• Wait for synchronization events on the CAB using either polling or blocking.

• Notify CAB threads and host processes that an event has occurred.

The CAB device driver in the host operating system enables host processes to map CAB memory into their address spaces (by using the mmap system call). This mapping is done as part of program initialization.
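The sketch below shows what this initialization might look like in a host library. The device path, mapping size, and address-translation convention are assumptions for illustration; the actual CAB driver interface is not specified in this paper.

/*
 * Hypothetical sketch of mapping CAB memory into a host process at
 * startup.  "/dev/cab0", the offset, and the physical base are
 * illustrative assumptions.
 */
#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>

#define CAB_DATA_MEM_SIZE (1024 * 1024)   /* 1 Mbyte of CAB data memory */

static uint8_t  *cab_base;                /* host virtual address of CAB memory */
static uintptr_t cab_phys_base;           /* CAB physical base, from the driver */

int cab_map(void)
{
    int fd = open("/dev/cab0", O_RDWR);   /* hypothetical device node */
    if (fd < 0)
        return -1;

    cab_base = mmap(NULL, CAB_DATA_MEM_SIZE, PROT_READ | PROT_WRITE,
                    MAP_SHARED, fd, 0);
    if (cab_base == MAP_FAILED)
        return -1;

    cab_phys_base = 0;                    /* assumed: data memory starts at 0 */
    return 0;
}

/* Translate a CAB physical address to a host virtual address. */
static inline void *cab_to_host(uintptr_t cab_addr)
{
    return cab_base + (cab_addr - cab_phys_base);
}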
Host condition variables are used for host-CAB synchronization. Host condition variables are similar to the condition variables in the threads package on the CAB, except that the waiting entities are host processes instead of CAB threads. Host condition variables are located in CAB memory, where they can be accessed by both CAB threads and host processes.

Signal and Wait are the main operations on host conditions. Signal increments a poll value in the host condition. Wait repeatedly tests the poll value and returns when the poll value changes. Both CAB threads and host processes can signal a host condition. Using polling, host processes can wait for host conditions without incurring the overhead of a system call.
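A minimal sketch of the polling path follows. The structure layout and function names are assumptions derived from the description above; the blocking path through the CAB device driver is described next.

/*
 * Sketch of a host condition variable as a shared poll value in CAB
 * memory.  Layout and names are assumptions, not the Nectar definitions.
 */
typedef struct {
    volatile unsigned int poll_value;   /* incremented by Signal */
} host_condition_t;

/* Called by a CAB thread or a host process to signal the condition. */
void host_condition_signal(host_condition_t *hc)
{
    hc->poll_value++;
}

/* Polling version of Wait: spin until the poll value changes.  No
 * system call is needed because the condition lives in CAB memory
 * mapped into the host process's address space. */
void host_condition_wait_poll(host_condition_t *hc)
{
    unsigned int seen = hc->poll_value;
    while (hc->poll_value == seen)
        ;   /* spin; a real implementation might yield or back off */
}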
In many situations, for example a server process waiting for a request, polling is inappropriate because it wastes host CPU cycles. Thus we also allow host processes to Wait for host conditions without polling, by calling the CAB device driver. The CAB driver records that the process is interested in the specified host condition and puts the process to sleep. When a host condition variable is signaled, its address is placed in the host signal queue (Figure 4), and the host is interrupted. The CAB driver handles the interrupt and uses the information in the queue to wake up the processes that are waiting for the host condition.

The host signal queue has fixed-size elements that consist of an opcode and a parameter. This queue can also be used by the CAB for other kinds of requests to the host, such as invocation of host I/O and debugging facilities.

Host processes wake up CAB threads by placing a request in the CAB signal queue (Figure 4) and interrupting the CAB. As with the host signal queue, the CAB signal queue is also used to pass other types of requests to the CAB.

The CAB signaling mechanism is extended into a simple host-to-CAB RPC facility by allowing the CAB to return a result to the host. The sync abstraction described in Section 3.4 provides the necessary synchronization and transfer of data.
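To make the signal-queue mechanism concrete, here is a sketch of a fixed-size queue element and of the host-side interrupt handler draining the host signal queue. The opcode values and helper routines are illustrative assumptions, not the actual Nectar definitions.

/*
 * Sketch of host-CAB signal queue elements and the host-side drain loop.
 * Opcodes and the queue/wakeup helpers are hypothetical.
 */
typedef struct {
    unsigned int opcode;     /* e.g. SIGNAL_CONDITION, HOST_IO_REQUEST */
    unsigned int parameter;  /* e.g. CAB address of a host condition   */
} signal_queue_entry_t;

enum { SIGNAL_CONDITION = 1, HOST_IO_REQUEST = 2 };   /* assumed opcodes */

/* CAB driver interrupt handler on the host: drain the host signal queue
 * and wake up processes waiting on the signaled conditions. */
void cab_driver_interrupt(void)
{
    signal_queue_entry_t e;
    while (host_signal_queue_dequeue(&e)) {            /* hypothetical */
        switch (e.opcode) {
        case SIGNAL_CONDITION:
            wakeup_waiters_on_condition(e.parameter);  /* hypothetical */
            break;
        case HOST_IO_REQUEST:
            forward_io_request(e.parameter);           /* hypothetical */
            break;
        }
    }
}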
3.3 Mailboxes

A mailbox is a queue of messages with a network-wide address. The buffer space used for the messages associated with a mailbox is allocated in CAB memory. By mapping CAB memory into their address spaces, host processes can build and consume messages in place.

Mailboxes also provide synchronization between readers and writers. A host process or a CAB thread blocks when it tries to read a message from an empty mailbox; it resumes when a message has been placed in the mailbox, typically by a transport protocol on the CAB.

These features make mailboxes attractive for communication between the host and the CAB. A host process can invoke a service on the CAB by placing a request in a server mailbox; this wakes up the server, which processes the request and places the result in a reply mailbox, where it can be read by the host process. Similarly, a CAB thread can invoke a service on the host by placing a request in a mailbox that is read by a host process.

Network-wide addressing of mailboxes enables host processes or CAB threads to send messages to remote mailboxes via transport protocols. In this way, remote services can be invoked from anywhere in the Nectar network.
Figure 5: Mailbox operations and message states (nonexistent, being written, available for reading, being read; transitions via Begin_Put, End_Put, Begin_Get, End_Get)
The Mailbox Interface

A two-phase scheme is used for both reading and writing messages in mailboxes. This allows messages to be produced or consumed in place without further copying. Figure 5 depicts the state transitions that a message undergoes as a result of the mailbox operations.

To write a message, a program first calls Begin_Put, specifying the mailbox and the size of the message. This returns a pointer to a newly allocated data area of the required size. The writer can now fill in the contents of the message; space for additional messages may be reserved in the meantime using additional Begin_Put calls. When the program has finished writing the message, it uses End_Put to make the message available to readers.

A reader calls Begin_Get to obtain a pointer to the next available message, allowing the data to be read in place. When the reader is finished with the data, End_Get releases the storage associated with it. Multiple threads can use these operations to concurrently process the messages arriving at a single mailbox.
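The following sketch shows the two-phase operations in use from a writer and a reader. The exact signatures, and the mailbox_t, request_t, and handle_request names, are assumptions based only on the operations named above.

/*
 * Sketch of the two-phase mailbox interface.  Types, argument order,
 * and error handling are illustrative assumptions.
 */

/* Writer: build a message in place in CAB memory. */
void send_request(mailbox_t *mb, const request_t *req)
{
    request_t *msg = (request_t *) Begin_Put(mb, sizeof(request_t));
    *msg = *req;              /* fill in the message contents in place */
    End_Put(mb, msg);         /* make the message visible to readers   */
}

/* Reader: consume the next message in place, then release its storage. */
void serve_one_request(mailbox_t *mb)
{
    request_t *msg = (request_t *) Begin_Get(mb);   /* blocks if empty */
    handle_request(msg);      /* read the data in place, no copy       */
    End_Get(mb, msg);         /* release the buffer back to the heap   */
}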
Some applications (such as IP, described in Section 4.1) need the ability to move a message from one mailbox to another. The operations described so far would allow this functionality, but at the cost of copying data between the different data areas associated with the two mailboxes. To avoid this overhead, we introduced an Enqueue operation that moves the message without copying the data. We also provided operations to "adjust" the size of messages in place, effectively removing a prefix or suffix of the message without doing any copying.

Both Begin_Put and Begin_Get block if no space or message is available. The calling thread is rescheduled when space becomes available or a message arrives. Interrupt handlers use non-blocking versions of these calls.
The mailbox interface also allows a reader upcall to be attached to a mailbox. The upcall is invoked as a side effect of the End_Put operation. This flexibility allows us to trade the concurrency of multiple threads against the overhead of context switching. For example, if a pair of threads uses a mailbox in a client-server style, the body of the server thread can instead be attached to the mailbox as a reader upcall; this effectively converts a cross-thread procedure call into a local one.
Implementation of Mailboxes

Mailboxes are implemented as queues of messages waiting to be read; buffer space for messages is allocated from a common heap. Allocating buffers from the heap provides better utilization of the CAB data memory, since it is shared among all mailboxes on the CAB. As an optimization, each mailbox caches a small buffer; this avoids the cost of heap allocation and deallocation when sending small messages. The queue representation also allows us to implement the Enqueue operation by simply moving pointers.

Mailbox operations from the host were initially implemented using the simple host-to-CAB RPC mechanism described in Section 3.2. We also implemented a shared-memory version in which mailbox data structures are updated directly from the host. Since the reader and writer data structures are separate, mutual exclusion between CAB threads and host processes can be avoided as long as the readers either all reside on the CAB or all reside on the host, and the same for writers. This is certainly true in the common case of a single reader and a single writer, and also, more generally, for client-server interfaces across the host-CAB boundary.

In return for the restrictions on placement of readers and writers, the shared-memory implementation provides about a factor of two improvement over the RPC-based implementation for Sun 4 hosts. We have configured the runtime system so that both implementations coexist, and the appropriate implementation can be selected dynamically on a per-mailbox basis.
3.4 Lightweight Synchronization

Synchronization between two threads or processes does not always need the full generality of mailboxes. For example, returning a status value from a transport protocol on the CAB to a sender on the host could be done using a mailbox, but all that is really needed is a condition variable and a shared word for the value. Syncs allow a user to return a one-word value to an asynchronous reader efficiently; they are similar to Reppy's events [14].
The Sync Interface

The operations on syncs are Alloc, Write, Read, and Cancel. Alloc allocates a new sync. Write places a one-word value in the sync data structure and marks the sync as written. Read blocks until a sync has been written, then frees the sync and returns its value. Alternatively, the reader can use Cancel to indicate that it is no longer interested in the sync. Cancel frees the sync if it has been written; otherwise Cancel just marks the sync as canceled, leaving it to be freed as part of a subsequent Write.
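The sketch below shows a plausible sync layout and the typical use named earlier: a host sender retrieving a transport protocol's status word. The field names, states, and helper routines are assumptions derived from the operations just described.

/*
 * Sketch of the sync abstraction: a one-word value plus a state flag,
 * allocated in CAB memory.  Layout and helpers are assumptions.
 */
typedef enum { SYNC_EMPTY, SYNC_WRITTEN, SYNC_CANCELED } sync_state_t;

typedef struct {
    volatile sync_state_t state;
    volatile unsigned int value;    /* the one-word result */
} sync_t;

/* Typical use: a host sender asks the CAB to transmit a message and
 * later reads the transport protocol's status through the sync. */
void send_with_status(mailbox_t *request_mb, const request_t *req)
{
    sync_t *s = Alloc();                      /* allocate a new sync      */
    submit_send_request(request_mb, req, s);  /* hypothetical: the CAB
                                                 will Write the status    */
    unsigned int status = Read(s);            /* blocks until written,
                                                 then frees the sync      */
    report_status(status);                    /* hypothetical             */
}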
Implementation of Syncs

Host processes and CAB threads allocate syncs in CAB memory; conflicts are avoided by using two separate pools of syncs. Since there is only one reader, reading a sync does not require any locking. Writing a sync does require a critical section: checking whether the sync has already been canceled and marking the sync as written must be done atomically. On the CAB this is done by masking interrupts. A host process offloads the execution of Write to the CAB using the CAB signaling mechanism. Cancel is implemented similarly.
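A minimal sketch of the CAB-side Write path follows, assuming the sync_t layout above. The interrupt-masking primitives and the free routine are hypothetical; the point is only that the canceled check and the written mark happen atomically with respect to other CAB activity.

/*
 * Sketch of the CAB-side Write operation.  disable_interrupts(),
 * restore_interrupts(), and sync_free() are hypothetical.
 */
void Write(sync_t *s, unsigned int value)
{
    int saved = disable_interrupts();       /* hypothetical primitive */

    if (s->state == SYNC_CANCELED) {
        /* The reader gave up; the writer is responsible for freeing. */
        sync_free(s);                       /* hypothetical           */
    } else {
        s->value = value;
        s->state = SYNC_WRITTEN;            /* Read will now return   */
    }

    restore_interrupts(saved);              /* hypothetical primitive */
}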
3.5 The Application Interface: Nectarine

Most of the current Nectar applications are written using Nectarine, the Nectar interface. Nectarine is implemented as a library linked into an application's address space. It provides applications with a procedural interface to the Nectar communication protocols and direct access to mailboxes in CAB memory. It also allows applications to create mailboxes and tasks on other hosts or CABs. Nectarine simplifies the task of writing Nectar applications by hiding the details of the host-CAB interface and presenting the same interface on both the CAB and host.
4 Protocol Implementation

We have implemented several transport protocols on the CAB, including TCP/IP and a set of Nectar-specific transport protocols. The Nectar-specific protocols provide datagram, reliable message, and request-response communication. The reliable message protocol is a simple stop-and-wait protocol, and the request-response protocol provides the transport mechanism for client-server RPC calls.

The implementation of the TCP/IP protocol suite serves as a good example of the use of the runtime system's features. Time-critical functions are performed by interrupt handlers and mailbox upcalls, most others by system threads. Mailboxes are used throughout for the management of data areas. The use of mailboxes proved advantageous in avoiding any copying of the data between receipt and presentation to the user. Although we only describe the implementation of TCP/IP, all the transport protocol implementations are structured in a similar fashion.
4.1 Internet Protocol

IP input processing is performed at interrupt time. When a packet arrives over the fiber, the datalink layer reads the datalink header and initiates DMA operations to place the data into an appropriate mailbox. For IP packets, this is always the IP input mailbox. After the entire protocol header arrives, the datalink layer issues a start-of-data upcall to the protocol so that useful work can be done while the remainder of the packet is being received into the mailbox. IP uses this opportunity to perform a sanity check of the IP header (including computation of the IP header checksum).

When the entire packet has been received, the datalink layer issues an end-of-data upcall. In this upcall, the IP input handler queues packets for reassembly if they are fragments of a larger datagram. The handler transfers complete datagrams to the input mailbox of the appropriate higher-level protocol. This transfer uses the mailbox Enqueue operation, so no data is copied.
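A sketch of this end-of-data upcall is shown below. The reassembly helpers, the demultiplexing table, and the Enqueue signature are assumptions; the key point from the text is that complete datagrams move to the next protocol's input mailbox by pointer manipulation, with no copy.

/*
 * Sketch of the IP end-of-data upcall.  ip_is_fragment(),
 * ip_reassembly_*(), protocol_input_mailbox(), ip_packet_t, and the
 * Enqueue signature are hypothetical.
 */
void ip_end_of_data_upcall(ip_packet_t *pkt)
{
    if (ip_is_fragment(pkt)) {
        ip_reassembly_queue(pkt);              /* queue until complete   */
        pkt = ip_reassembly_complete(pkt);     /* NULL if still partial  */
        if (pkt == NULL)
            return;
    }

    /* Demultiplex on the IP protocol field (TCP, UDP, ICMP, ...). */
    mailbox_t *next_mb = protocol_input_mailbox(pkt->ip_header.protocol);

    /* Move the datagram to the higher protocol's mailbox without copying. */
    Enqueue(pkt, next_mb);
}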
Higher-level protocols (including ICMP) are required to provide an input mailbox to IP; this mailbox constitutes the entire receive interface between IP and higher protocols. One advantage of this interface is that it allows the higher protocols to be implemented either as mailbox upcalls, which are called whenever their input mailbox is written, or as separate threads, which block until the next packet arrives. In our current system, ICMP is implemented as a mailbox upcall, while UDP and TCP each have their own server threads.
While the receive interface between IP and higher protocols consists of a simple mailbox, the send interface is more complex. To send a packet, higher protocols are expected to call IP_Output with a header template, a reference to the data they wish to send, a flag indicating whether the data area should be freed once sent, and a route to the destination (if known). The header template must contain a partially filled-in IP header. Protocols may also append their own header to the end of the template. IP_Output fills in the remaining fields in the IP header and calls the datalink layer to transmit the packet.
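As an illustration, here is how a higher protocol (UDP in this sketch) might call IP_Output. The template structure, the header-initialization helpers, and the connection type are assumptions; the text specifies only the four arguments: header template, data reference, free-after-send flag, and optional route.

/*
 * Sketch of a higher-level protocol calling IP_Output.
 * header_template_t, udp_conn_t, and the *_init helpers are hypothetical.
 */
void udp_send(udp_conn_t *conn, message_t *data)
{
    header_template_t tmpl;

    ip_header_init(&tmpl.ip, IPPROTO_UDP, conn->remote_addr);  /* partially
                                                                  filled-in IP header */
    udp_header_init(&tmpl.udp, conn->local_port, conn->remote_port,
                    message_length(data));                     /* appended UDP header */

    IP_Output(&tmpl,            /* header template (IP + UDP)           */
              data,             /* reference to the data to send        */
              1,                /* free the data area once it is sent   */
              conn->route);     /* route to the destination, if known   */
}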
4.2 Transmission Control Protocol

The Nectar TCP implementation runs almost entirely in system threads, rather than at interrupt time. This allows shared data structures to be protected with mutual exclusion locks rather than by disabling interrupts. We plan to compare this approach to a strictly interrupt-driven implementation of TCP as part of the experiment discussed in Section 3.1.

All TCP input processing is performed by the TCP input thread. This thread blocks on a Begin_Get until a packet arrives. Once it gets a packet, it examines the TCP header, checksums the entire packet, and performs standard TCP input processing. To pass data to the user, TCP simply deletes the headers and transfers the packet to the user's receive mailbox using the Enqueue operation.
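A sketch of the input thread's main loop follows. The checksum, demultiplexing, and header-stripping helpers are assumptions; the loop structure follows the text: block on Begin_Get, process, then Enqueue into the user's receive mailbox without copying.

/*
 * Sketch of the TCP input thread.  tcp_checksum_ok(), tcp_demux(),
 * tcp_do_input(), strip_headers(), and the connection type are hypothetical.
 */
static any_t tcp_input_thread(any_t arg)
{
    mailbox_t *tcp_input_mb = (mailbox_t *) arg;

    for (;;) {
        packet_t *pkt = (packet_t *) Begin_Get(tcp_input_mb); /* blocks */

        if (tcp_checksum_ok(pkt)) {                /* checksum whole packet   */
            tcp_connection_t *conn = tcp_demux(pkt);
            tcp_do_input(conn, pkt);               /* standard TCP processing */

            strip_headers(pkt);                    /* "adjust" away the headers */
            Enqueue(pkt, conn->user_receive_mb);   /* hand data to the user     */
        } else {
            End_Get(tcp_input_mb, pkt);            /* drop bad packet */
        }
    }
    return NULL;                                   /* not reached */
}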
A user wishing to send data on an established TCP connection places a request in the TCP send-request mailbox. The data to be sent may be placed in the send-request mailbox following the request, or it may already exist in some other mailbox, in which case the user includes a pointer to it in the request. The TCP send thread on the CAB services this request by placing the data on the send queue of the appropriate connection and calling the TCP output routine. CAB-resident senders can do this directly without involving the TCP send thread.
5 Usage

The flexibility of the CAB software architecture allows us to choose which layers in the protocol stack are handled by the CAB, effectively changing the interface the CAB presents to the host. Three such interfaces are described below, ordered in increasing degree of CAB functionality.
5.1 Network Device

The Nectar network can be used as a conventional, high-speed LAN by treating the CAB as a network device and enhancing the CAB device driver to act as a network interface. We have implemented a driver at this level for the Berkeley networking code [11], performing IP and higher-level protocols on the host as usual. The advantage of this approach is binary compatibility: all the familiar network services are immediately available.

To perform networking functions, the device driver cooperates with a server thread on the CAB that is responsible for transmitting and receiving packets over Nectar. The driver and the server share a pool of buffers: to send a packet, the driver writes the packet into a free buffer in the output pool and notifies the server that the packet should be sent; when a packet is received, the server finds a free input buffer, receives the packet into the buffer, and informs the driver of the packet's arrival.
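The send half of this protocol might look like the sketch below. The buffer type, pool operations, and notification call are assumptions; only the sequence (allocate a free output buffer, copy the packet into CAB memory, notify the CAB server) comes from the text.

/*
 * Sketch of the host driver's send path for the network-device interface.
 * cab_buffer_t, output_pool_alloc(), copy_to_cab(), and
 * notify_cab_server() are hypothetical.
 */
#include <errno.h>
#include <stddef.h>

int nectar_if_send(const void *pkt, size_t len)
{
    cab_buffer_t *buf = output_pool_alloc();       /* hypothetical pool op */
    if (buf == NULL)
        return ENOBUFS;                            /* no free output buffer */

    copy_to_cab(buf, pkt, len);                    /* write into CAB memory */
    notify_cab_server(buf);                        /* e.g. via the CAB signal queue */
    return 0;
}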
5.2 Protocol Engine

The CAB can be used as a protocol engine by offloading transport protocol processing to the CAB. Several interfaces are possible on the host; these interfaces are independent of the particular transport protocols implemented on the CAB.

The Nectarine interface that was described in Section 3.5 provides applications with a flexible communication model. Since it uses the host-CAB buffering and synchronization facilities directly, some or all of CAB memory must be mapped into applications' address spaces.

The familiar Berkeley socket interface [13] is also being implemented at this level. Initially, an emulation library will be provided for applications that can be re-linked. Eventually, we will move this support into the UNIX kernel, which will intercept operations on Nectar connections and dispatch them to the CAB. This approach incurs the cost of system calls, but allows binary compatibility.

Work is also in progress to support the Mach interprocess communication interface [1]. Network IPC in Mach is provided by a message-forwarding server external to the Mach kernel; this server is a natural candidate for execution on the CAB.
5.3 Application-level Communication Engine

The Nectar CAB and its runtime system are more flexible than many proposed protocol engines, since application-specific code can be executed on the CAB. Distributed applications on Nectar often perform tasks on both hosts and CABs, effec
