`Research Showcase @ CMU
`
`Computer Science Department
`
`School of Computer Science
`
`1990
`
`Protocol implementation on the Nectar
`communication processor
`
`Eric C. Cooper
`Carnegie Mellon University
`
`Follow this and additional works at: http://repository.cmu.edu/compsci
`
This Technical Report is brought to you for free and open access by the School of Computer Science at Research Showcase @ CMU. It has been
accepted for inclusion in Computer Science Department by an authorized administrator of Research Showcase @ CMU. For more information, please
contact research-showcase@andrew.cmu.edu.
`
`
`NOTICE WARNING CONCERNING COPYRIGHT RESTRICTIONS:
`The copyright law of the United States (title 17, U.S. Code) governs the making
`of photocopies or other reproductions of copyrighted material. Any copying of this
`document without permission of its author may be prohibited by law.
`
`
`Protocol Implementation on the
`Nectar Communication Processor
`Eric C. Cooper, Peter A. Steenkiste,
`Robert D. Sansom, and Brian D. Zill
`September 1990
CMU-CS-90-153
`
`School of Computer Science
`Carnegie Mellon University
`Pittsburgh, PA 15213
`
SIGCOMM '90 Symposium on Communications Architectures and Protocols
`Philadelphia, Pennsylvania
`September 24-27, 1990
`
This research was sponsored by the Defense Advanced Research Projects Agency, Information Science and
Technology Office, under the title "Research on Parallel Computing," ARPA Order No. 7330, issued by
DARPA/CMO under Contract MDA972-90-C-0035.
The views and conclusions contained in this document are those of the authors and should not be interpreted as
representing the official policies, either expressed or implied, of the U.S. Government.
`
`
`Keywords: protocol implementation, high-speed networks
`
`
`Abstract
`
We have built a high-speed local-area network called Nectar that uses programmable
communication processors as host interfaces. In contrast to most protocol engines, our
communication processors have a flexible runtime system that supports multiple transport
protocols as well as application-specific activities. In particular, we have implemented the
TCP/IP protocol suite and Nectar-specific communication protocols on the communication
processor. The Nectar network currently has 25 hosts and has been in use for over a year.
The flexibility of our communication processor design does not compromise its performance.
The latency of a remote procedure call between application tasks executing on two
Nectar hosts is less than 500 µsec. The same tasks can obtain a throughput of 28 Mbit/sec
using either TCP/IP or Nectar-specific transport protocols. This throughput is limited by the
VME bus that connects a host and its communication processor. Application tasks executing
on two communication processors can obtain 90 Mbit/sec of the possible 100 Mbit/sec
physical bandwidth using Nectar-specific transport protocols.
`
`
`1 Introduction
`
`The protocols used by hosts for network communication can be executed on the host
`processors or offloaded to separate communication processors. In Nectar, a high-speed
`local-area network, we have taken the latter approach. By offloading transport protocol
processing from the host to the communication processor, we reduce the burden on the host.
Executing protocols on a communication processor is also attractive if the host is unsuited
`to protocol processing, as in the case of specialized architectures, or if the host operating
`system cannot easily be modified, as in the case of supercomputers.
`Unlike traditional network front-end processors, the Nectar communication processor
`has a general-purpose CPU and a flexible runtime system that support both transport protocol
`processing and application-specific tasks. Protocol implementations on the communication
`processor can be added or optimized with no change to the host system software; this
`is particularly advantageous in environments with heterogeneous hardware and operating
`systems. Application-specific communication tasks can be developed for either the host or
`the communication processor using the Nectarine programming interface.
`The interface between the communication processor and user processes on the host is
`based on shared memory. The buffer memory of the communication processor is directly
`accessible to user processes. No system calls or user-to-kernel copy operations are required
`to send and receive messages. As a result, host processes can communicate with lower
`latency (by a factor of 5) than would be possible using the UNIX socket interface of the
`host operating system [13].
Related work on host interfaces for high-speed networks includes the VMP Network
Adapter Board [10] and the Protocol Engine design [4]. In these approaches, processing of
specific transport protocols is offloaded to the network interface, but there is no provision
for the execution of application-specific tasks or multiple transport protocols.
The Nectar runtime system is similar in structure to other operating systems designed
specifically to support network protocols, such as Swift [7] and the x-kernel [12]. However,
the Nectar system is distinguished by its emphasis on the interface between the communication
processor and the host.
The Nectar communication processor together with its host can be viewed as a (heterogeneous)
shared-memory multiprocessor. Dedicating one processor of a multiprocessor
`host to communication tasks can achieve some of the benefits of the Nectar approach, but
`this constrains the choice of host operating system and hardware. In contrast, the Nectar
`communication processor has been used with a variety of hosts and host operating systems.
`In this paper we describe and evaluate our approach to building a communication
`processor. We first give an overview of the Nectar hardware (Section 2). We then describe
the design of the runtime system on the communication processor and the interactions
between the runtime system and host processes (Section 3). As an example of the use of the runtime
system, we discuss our implementation of TCP/IP in Section 4. In Section 5, we discuss
`how the flexibility of the Nectar design allows different levels of communication functions
`to be offloaded from the host to the communication processor. A performance evaluation
`of the Nectar system is presented in Section 6.
`
`
[Figure: hosts attach to CABs over a VME bus; the CABs connect through fiber-optic links to HUB crossbar switches.]

Figure 1: Nectar system overview
`
`2 The Nectar System
`The Nectar system consists of a set of host computers connected in an arbitrary mesh via
`crossbar switches called HUBs (Figure 1). Each host uses a communication processor,
`called a CAB (Communication Accelerator Board), as its interface to the Nectar network.
`More details about the Nectar architecture can be found in an earlier paper [2].
`
`2.1 HUB Overview
`The Nectar network is built from fiber-optic links and one or more HUBs. A HUB consists of
a crossbar switch, a set of I/O ports, and a controller. The controller implements commands
`that the CABs use to set up both packet-switching and circuit-switching connections over
`the network.
Large Nectar systems are built using multiple HUBs. In such systems, some of the
HUB I/O ports are used to connect HUBs together. The CABs use source routing to send
`a message through the network. The HUB command set includes support for multi-hop
`connections and low-level flow control.
`In the current Nectar system, the fiber-optic lines operate at 100 Mbit/sec and the HUBs
`are 16 x 16 crossbars. The hardware latency to set up a connection and transfer the first
`byte of a packet through a single HUB is 700 nanoseconds.
`
`2.2 CAB Overview
`A block diagram of the CAB is shown in Figure 2. The heart of the CAB is a general-purpose
RISC CPU. Two optical fibers, one for each direction, connect the CAB to an I/O port
on the HUB. The fibers are connected to FIFOs for temporary buffering of network data.
Cyclic redundancy checksums for incoming and outgoing data are computed by hardware.
`
`2
`
`DEFS-ALA0010710
`Ex.1022.007
`
`DELL
`
`
`
[Figure: CAB block diagram. A data memory bus connects the DMA controller, the fiber-in and fiber-out links to the HUB, the data memory, and the VME interface to the host; a CPU bus connects the CPU, program memory, registers and devices, memory protection, and a serial line.]

Figure 2: CAB block diagram
`
`The CAB communicates with the host through a VME interface, a common backplane in
`our environment.
`The CAB includes a hardware DMA controller that can manage simultaneous data
`transfers between the incoming and outgoing fibers and CAB memory, as well as between
`VME and CAB memory, leaving the CAB CPU free for protocol and application processing.
`The DMA controller also handles low-level flow control for network communication: it
waits for data to arrive if the input FIFO is empty, or for data to drain if the output FIFO is
`full.
`To provide the necessary memory bandwidth, the CAB memory is split into two regions:
`one intended for use as program memory, the other as data memory. DMA transfers are
`supported for data memory only; transfers to and from program memory must be performed
`by the CPU. The memory architecture is thus optimized for the expected usage pattern,
while still allowing code to be executed from data memory or packets to be sent from
`program memory.
`Memory protection hardware on the CAB allows access permissions to be associated
`with each 1 Kbyte page. Multiple protection domains are provided, each with its own set of
`access permissions. Changing the protection domain is accomplished by reloading a single
`register.
`The current CAB implementation uses a SPARC processor running at 16.5 MHz. The
`program memory region contains 128 Kbytes of PROM and 512 Kbytes of RAM. The data
`memory region contains 1 Mbyte of RAM. Both memories are implemented using 35 nsec
`static RAM.
`
`3 Runtime System
`
`The CAB runtime system must support concurrent activities that include network interrupts,
transport-protocol processing, and application-specific computation.
`
`3
`
`DEFS-ALA0010711
`Ex.1022.008
`
`DELL
`
`
`
[Figure: Nectar software architecture. On the CAB: applications, transport protocols, the datalink protocol, and the runtime system modules (threads, mailboxes, syncs), beneath the Nectarine interface; on the host: applications using the Nectarine interface and the CAB device driver.]

Figure 3: Nectar software architecture
`
A lightweight interface between the host and the CAB is also essential; expensive host-CAB
synchronization, data copying, and system calls must be avoided.
`Figure 3 shows the structure of the Nectar software on the host and CAB. The basic
`CAB runtime system provides support for multiprogramming (the threads package) and
`for buffering and synchronization (the mailbox and sync modules). Transport protocols
`(described in Section 4) are implemented on the CAB using these facilities. The Nectarine
`layer provides a consistent interface for applications on both the CAB and the host. The
`CAB device driver in the host operating system allows host processes to map CAB memory
`into their address spaces.
`
`3.1 Threads, Interrupts, and Upcalls
`Previous protocol implementations have demonstrated that multiple threads are useful,
`but multiple address spaces are unnecessary [7, 11, 12]. Since we expected most of the
`activities on the CAB to be protocol-related, we designed the CAB to provide a single
`physical address space, and the runtime system to support a single address space shared by
`multiple threads. The runtime system can use the multiple protection domains described in
`Section 2 to provide firewalls around application tasks if desired.
`The threads package for the CAB was derived from the Mach C Threads package [8]. It
`provides forking and joining of threads, mutual exclusion using locks, and synchronization
`by means of condition variables. Context switch time is determined by the cost of saving
`and restoring the SPARC register windows; 20 µsec is typical in the current implementation.
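
The sketch below illustrates the fork/join, mutex, and condition-variable style of programming that the package supports. It is written against POSIX threads purely for concreteness; the CAB package is derived from Mach C Threads, so the names here are only analogous, not the actual CAB API.

    #include <pthread.h>
    #include <stdio.h>

    static pthread_mutex_t lock  = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  ready = PTHREAD_COND_INITIALIZER;
    static int event_count = 0;

    /* A thread in the "system thread" style: wait for an event, do a brief
     * burst of processing, then wait again. */
    static void *worker(void *arg)
    {
        pthread_mutex_lock(&lock);
        while (event_count == 0)
            pthread_cond_wait(&ready, &lock);   /* relinquish the processor */
        event_count--;
        pthread_mutex_unlock(&lock);
        printf("handled one event\n");
        return NULL;
    }

    int main(void)
    {
        pthread_t t;
        pthread_create(&t, NULL, worker, NULL); /* fork */

        pthread_mutex_lock(&lock);
        event_count++;                          /* produce an event */
        pthread_cond_signal(&ready);            /* wake the worker */
        pthread_mutex_unlock(&lock);

        pthread_join(t, NULL);                  /* join */
        return 0;
    }
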
`System threads (such as those implementing network protocols) are typically driven
`by events such as a packet arriving or a condition being signaled; after a brief burst of
`processing, they relinquish the processor by waiting for the next event. We make no such
`assumptions about application threads: they may perform long computations with few
`synchronization points, or they may get stuck in infinite loops. Preemption of application
`threads is therefore necessary. The current scheduler uses a preemptive, priority-based
`scheme, with system threads running at a higher priority than application threads.
Before we implemented preemptive scheduling of threads, upcalls [7] from interrupt
handlers were the only way to provide sufficiently fast response to external events.
`
`4
`
`DEFS-ALA0010712
`Ex.1022.009
`
`DELL
`
`
`
[Figure: the CAB device driver on the host and the interrupt handler on the CAB, connected by the host signal queue and the CAB signal queue.]

Figure 4: Host-CAB signaling
`
For example, because of the speed at which an incoming packet fills the CAB input FIFO, a
start-of-packet interrupt must be handled within a few tens of microseconds. Waking up
another thread has unacceptably long response time: the context switch would not occur
until the currently running thread reached a synchronization point and relinquished the
processor.
`Using upcalls from interrupt level means that data structures must be shared between
`threads and interrupt handlers, resulting in critical sections that must be protected by
`appropriate masking of interrupts. Disabling interrupts is less elegant than protecting
`critical sections by means of module-specific mutual exclusion locks because it violates
`modularity. The implementor of an abstraction must know whether its callers are threads
`or interrupt handlers so that interrupts can be masked appropriately.
With preemption, a context switch occurs as soon as a higher-priority thread is awakened.
We therefore plan to revisit our decision to perform significant amounts of protocol
processing at interrupt time. We will experiment with moving portions of it into
high-priority threads. Although this will introduce additional context switching, the CAB will
spend less time with interrupts disabled, so overall performance is likely to improve.
`The response time could also be improved by using the SPARC's interrupt priority
`scheme to implement nested interrupts. Although appropriate use of nested interrupts
`could further reduce latency, the cost would be greater complexity and lack of modularity
`in the code because the implementor of an abstraction would have to be aware of the
`possible interrupt priority levels of users of the abstraction.
`
`3.2 Host-CAB Signaling
`
`Host processes and CAB threads interact using shared data structures that are mapped into
`the address spaces of the host processes. To manipulate data structures in CAB memory, a
`host process must be able to do the following:
• Map CAB memory into its address space and translate between CAB physical addresses
and host virtual addresses.
`
`• Wait for synchronization events on the CAB using either polling or blocking.
`
`• Notify CAB threads and host processes that an event has occurred.
The CAB device driver in the host operating system enables host processes to map CAB
memory into their address spaces (by using the mmap system call). This mapping is done
as part of program initialization.
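
As a rough illustration, the initialization step amounts to something like the following sketch; the device name, mapping size, and offset below are assumptions, not the actual driver interface.

    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mman.h>

    #define CAB_MEM_SIZE (1 << 20)   /* assume the 1 Mbyte CAB data memory */

    /* Map CAB memory into the calling process; /dev/cab0 is hypothetical. */
    static void *map_cab_memory(void)
    {
        int fd = open("/dev/cab0", O_RDWR);
        if (fd < 0) { perror("open"); exit(1); }

        void *base = mmap(NULL, CAB_MEM_SIZE, PROT_READ | PROT_WRITE,
                          MAP_SHARED, fd, 0);
        if (base == MAP_FAILED) { perror("mmap"); exit(1); }
        return base;                 /* base of the mapped CAB memory */
    }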
`
`5
`
`DEFS-ALA0010713
`Ex.1022.010
`
`DELL
`
`
`
Host condition variables are used for host-CAB synchronization. Host condition variables
are similar to the condition variables in the threads package on the CAB, except that
the waiting entities are host processes instead of CAB threads. Host condition variables
are located in CAB memory, where they can be accessed by both CAB threads and host
processes.
`Signal and Wait are the main operations on host conditions. Signal increments
`a poll value in the host condition. Wait repeatedly tests the poll value, and returns when
`the poll value changes. Both CAB threads and host processes can signal a host condition.
`Using polling, host processes can wait for host conditions without incurring the overhead
`of a system call.
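
A minimal sketch of this polling path, assuming a host condition is simply a counter in CAB memory (the field name and layout are our guesses, not the actual data structure):

    /* Host condition variable living in CAB memory (layout assumed). */
    typedef struct {
        volatile unsigned int poll_value;    /* incremented by each Signal */
    } host_condition_t;

    /* Signal: callable from CAB threads or host processes. */
    void host_cond_signal(host_condition_t *hc)
    {
        hc->poll_value++;
    }

    /* Wait (polling form): spin until the poll value changes; no system call. */
    void host_cond_wait(host_condition_t *hc)
    {
        unsigned int seen = hc->poll_value;
        while (hc->poll_value == seen)
            ;   /* busy-wait */
    }
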
`In many situations, for example a server process waiting for a request, polling is
`inappropriate because it wastes host CPU cycles. Thus we also allow host processes to
`Wait for host conditions without polling, by calling the CAB device driver. The CAB
`driver records that the process is interested in the specified host condition and puts the
`process to sleep. When a host condition variable is signaled, its address is placed in the host
`signal queue (Figure 4), and the host is interrupted. The CAB driver handles the interrupt
`and uses the information in the queue to wake up the processes that are waiting for the host
`condition.
The host signal queue has fixed-size elements that consist of an opcode and a parameter.
This queue can also be used by the CAB for other kinds of requests to the host, such as
invocation of host I/O and debugging facilities.
`Host processes wake up CAB threads by placing a request in the CAB signal queue
`(Figure 4) and interrupting the CAB. As with the host signal queue, the CAB signal queue
`is also used to pass other types of requests to the CAB.
`The CAB signaling mechanism is extended into a simple host-to-CAB RPC facility by
`allowing the CAB to return a result to the host. The sync abstraction described in Section 3.4
`provides the necessary synchronization and transfer of data.
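
For illustration, a signal-queue entry and the producer side might look roughly like this; the queue depth, field names, and ring-buffer layout are assumptions.

    /* One fixed-size element of a host or CAB signal queue. */
    struct sig_entry {
        unsigned int opcode;      /* e.g. signal-host-condition, host I/O request */
        unsigned int parameter;   /* e.g. address of the condition or sync        */
    };

    #define SIGQ_SIZE 64          /* assumed queue depth */

    struct sig_queue {
        volatile unsigned int head;            /* producer index */
        volatile unsigned int tail;            /* consumer index */
        struct sig_entry entries[SIGQ_SIZE];
    };

    /* Post a request, then interrupt the other side so it drains the queue. */
    void sigq_post(struct sig_queue *q, unsigned int op, unsigned int param)
    {
        unsigned int h = q->head;
        q->entries[h % SIGQ_SIZE].opcode    = op;
        q->entries[h % SIGQ_SIZE].parameter = param;
        q->head = h + 1;
        /* ... raise the host or CAB interrupt here ... */
    }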
`
`3.3 Mailboxes
`A mailbox is a queue of messages with a network-wide address. The buffer space used for
`the messages associated with a mailbox is allocated in CAB memory. By mapping CAB
`memory into their address spaces, host processes can build and consume messages in place.
`Mailboxes also provide synchronization between readers and writers. A host process
`or a CAB thread blocks when it tries to read a message from an empty mailbox; it resumes
`when a message has been placed in the mailbox, typically by a transport protocol on the
`CAB.
`These features make mailboxes attractive for communication between the host and the
`CAB. A host process can invoke a service on the CAB by placing a request in a server
`mailbox; this wakes up the server which processes the request and places the result in a
`reply mailbox, where it can be read by the host process. Similarly, a CAB thread can invoke
`a service on the host by placing a request in a mailbox that is read by a host process.
`Network-wide addressing of mailboxes enables host processes or CAB threads to send
`messages to remote mailboxes via transport protocols. In this way, remote services can be
`invoked from anywhere in the Nectar network.
`
`6
`
`DEFS-ALA0010714
`Ex.1022.011
`
`DELL
`
`
`
[Figure: message state transitions. Begin_Put takes a message from nonexistent to "being written"; End_Put makes it "available for reading"; Begin_Get moves it to "being read"; End_Get returns it to nonexistent.]

Figure 5: Mailbox operations and message states
`
`The Mailbox Interface
`A two-phase scheme is used for both reading and writing messages in mailboxes. This
`allows messages to be produced or consumed in place without further copying. Figure 5
`depicts the state transitions that a message undergoes as a result of the mailbox operations.
`To write a message, a program first calls Begin_Put, specifying the mailbox and the
`size of the message. This returns a pointer to a newly allocated data area of the required size.
`The writer can now fill in the contents of the message; space for additional messages may
`be reserved in the meantime using additional Begin_Put calls. When the program has
finished writing the message, it uses End_Put to make the message available to readers.
A reader calls Begin_Get to obtain a pointer to the next available message, allowing
the data to be read in place. When the reader is finished with the data, End_Get releases the
storage associated with it. Multiple threads can use these operations to process the messages
arriving at a single mailbox concurrently.
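
The following sketch shows how a writer and a reader might use these operations. The operation names come from the text, but the exact signatures and the request structure are our assumptions.

    typedef struct mailbox mailbox_t;

    /* Assumed signatures for the two-phase operations described above. */
    extern void *Begin_Put(mailbox_t *mb, unsigned int size); /* may block */
    extern void  End_Put(mailbox_t *mb, void *msg);
    extern void *Begin_Get(mailbox_t *mb);                    /* may block */
    extern void  End_Get(mailbox_t *mb, void *msg);

    struct request { int opcode; int arg; };   /* hypothetical message format */

    /* Writer: build the message in place in CAB memory, then publish it. */
    void send_request(mailbox_t *mb, int opcode, int arg)
    {
        struct request *r = Begin_Put(mb, sizeof *r);
        r->opcode = opcode;
        r->arg    = arg;
        End_Put(mb, r);
    }

    /* Reader: consume the message in place, then release its storage. */
    void serve_one(mailbox_t *mb)
    {
        struct request *r = Begin_Get(mb);
        /* ... process r->opcode and r->arg ... */
        End_Get(mb, r);
    }
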
`Some applications (such as IP, described in Section 4.1) need the ability to move a
`message from one mailbox to another. The operations described so far would allow this
`functionality, but at the cost of copying data between the different data areas associated
`with the two mailboxes. To avoid this overhead, we introduced an Enqueue operation that
`moves the message without copying the data. We also provided operations to "adjust" the
`size of messages in place, effectively removing a prefix or suffix of the message without
`doing any copying.
Both Begin_Put and Begin_Get block if no space or message is available. The
`calling thread is rescheduled when space becomes available or a message arrives. Interrupt
`handlers use non-blocking versions of these calls.
`The mailbox interface also allows a reader upcall to be attached to a mailbox. The
upcall is invoked as a side effect of the End_Put operation. This flexibility allows us to
`
`7
`
`DEFS-ALA0010715
`Ex.1022.012
`
`DELL
`
`
`
`trade the concurrency of multiple threads against the overhead of context switching. For
`example, if a pair of threads uses a mailbox in a client-server style, the body of the server
`thread can instead be attached to the mailbox as a reader upcall; this effectively converts a
`cross-thread procedure call into a local one.
`
`Implementation of Mailboxes
`Mailboxes are implemented as queues of messages waiting to be read; buffer space for
`messages is allocated from a common heap. Allocating buffers from the heap provides
`better utilization of the CAB data memory since it is shared among all mailboxes on the
`CAB. As an optimization, each mailbox caches a small buffer; this avoids the cost of heap
`allocation and deallocation when sending small messages. The queue representation also
`allows us to implement the Enqueue operation by simply moving pointers.
Mailbox operations from the host were initially implemented using the simple host-to-CAB
RPC mechanism described in Section 3.2. We also implemented a shared memory
version in which mailbox data structures are updated directly from the host. Since the
`reader and writer data structures are separate, mutual exclusion between CAB threads and
`host processes can be avoided as long as the readers either all reside on the CAB or all
`reside on the host, and the same for writers. This is certainly true in the common case of
`a single reader and a single writer, and also, more generally, for client-server interfaces
`across the host-CAB boundary.
In return for the restrictions on placement of readers and writers, the shared memory
implementation provides about a factor of two improvement over the RPC-based
implementation for Sun 4 hosts. We have configured the runtime system so that both implementations
`coexist, and the appropriate implementation can be selected dynamically on a per-mailbox
`basis.
`
`3.4 Lightweight Synchronization
`Synchronization between two threads or processes does not always need the full generality
`of mailboxes. For example, returning a status value from a transport protocol on the CAB
`to a sender on the host could be done using a mailbox, but all that is really needed is a
`condition variable and a shared word for the value. Syncs allow a user to return a one-word
`value to an asynchronous reader efficiently; they are similar to Reppy's events [14].
`
`The Sync Interface
The operations on syncs are Alloc, Write, Read, and Cancel. Alloc allocates a
new sync. Write places a one-word value in the sync data structure, and marks the sync
`as written. Read blocks until a sync has been written, then frees the sync and returns its
value. Alternatively, the reader can use Cancel to indicate that it is no longer interested in
`the sync. Cancel frees the sync if it has been written; otherwise Cancel just marks the
`sync as canceled, leaving it to be freed as part of a subsequent Write.
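
A sketch of how a host-side sender might use a sync to collect a transport status word; the operation names come from the text, while the signatures and the send-request helper are assumptions.

    typedef struct sync sync_t;

    extern sync_t      *Alloc(void);                 /* allocate a sync           */
    extern void         Write(sync_t *s, unsigned int value);
    extern unsigned int Read(sync_t *s);             /* block, free, return value */
    extern void         Cancel(sync_t *s);           /* reader gives up           */

    /* Hypothetical send request that carries the sync to the CAB protocol. */
    extern void post_send_request(const void *msg, unsigned int len, sync_t *status);

    unsigned int send_and_wait(const void *msg, unsigned int len)
    {
        sync_t *status = Alloc();
        post_send_request(msg, len, status);   /* transport Writes the status */
        return Read(status);                   /* blocks until it is written  */
    }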
`
`Implementation of Syncs
`Host processes and CAB threads allocate syncs in CAB memory; conflicts are avoided by
`using two separate pools of syncs. Since there is only one reader, reading a sync does not
`require any locking. Writing a sync does require a critical section: checking whether the
`
`8
`
`DEFS-ALA0010716
`Ex.1022.013
`
`DELL
`
`
`
`sync has already been canceled and marking the sync as written must be done atomically.
`On the CAB this is done by masking interrupts. A host process offloads the execution
`of Write to the CAB using the CAB signaling mechanism. Cancel is implemented
`similarly.
`
`3.5 The Application Interface: Nectarine
`Most of the current Nectar applications are written using Nectarine, the Nectar Interface.
`Nectarine is implemented as a library linked into an application's address space. It provides
`applications with a procedural interface to the Nectar communication protocols and direct
`access to mailboxes in CAB memory. It also allows applications to create mailboxes and
`tasks on other hosts or CABs. Nectarine simplifies the task of writing Nectar applications
`by hiding the details of the host-CAB interface and presenting the same interface on both
`the CAB and host.
`
`4 Protocol Implementation
`
`We have implemented several transport protocols on the CAB, including TCP/IP and a
`set of Nectar-specific transport protocols. The Nectar-specific protocols provide datagram,
`reliable message, and request-response communication. The reliable message protocol is
`a simple stop-and-wait protocol, and the request-response protocol provides the transport
`mechanism for client-server RPC calls.
`The implementation of the TCP/IP protocol suite serves as a good example of the use of
`the runtime system's features. Time-critical functions are performed by interrupt handlers
`and mailbox upcalls, most others by system threads. Mailboxes are used throughout for
`the management of data areas. The use of mailboxes proved advantageous in avoiding
any copying of the data between receipt and presentation to the user. Although we
`only describe the implementation of TCP/IP, all the transport protocol implementations are
`structured in a similar fashion.
`
`4.1
`
`Internet Protocol
`IP input processing is performed at interrupt time. When a packet arrives over the fiber,
`the datalink layer reads the datalink header and initiates DMA operations to place the data
into an appropriate mailbox. For IP packets, this is always the IP input mailbox. After the
`entire protocol header arrives, the datalink layer issues a start-of-data upcall to the protocol
`so that useful work can be done while the remainder of the packet is being received into
`the mailbox. IP uses this opportunity to perform a sanity check of the IP header (including
`computation of the IP header checksum).
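
The header check uses the standard Internet one's-complement checksum; a generic version of that computation (not the CAB's actual code) looks like this:

    #include <stddef.h>
    #include <stdint.h>

    /* Internet checksum over the IP header; hdr_len is in bytes and is always
     * a multiple of 4 for IP headers.  Returns 0 if the header (including its
     * stored checksum field) is consistent. */
    static uint16_t ip_header_checksum(const uint16_t *hdr, size_t hdr_len)
    {
        uint32_t sum = 0;

        for (size_t i = 0; i < hdr_len / 2; i++)
            sum += hdr[i];                        /* 16-bit one's-complement add */
        while (sum >> 16)
            sum = (sum & 0xffff) + (sum >> 16);   /* fold carries back in */
        return (uint16_t)~sum;
    }
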
`When the entire packet has been received, the datalink layer issues an end-of-data
upcall. In this upcall, the IP input handler queues packets for reassembly if they are
`fragments of a larger datagram. The handler transfers complete datagrams to the input
`mailbox of the appropriate higher-level protocol. This transfer uses the mailbox Enqueue
`operation, so no data is copied.
`Higher-level protocols (including ICMP) are required to provide an input mailbox to
`IP; this mailbox constitutes the entire receive interface between IP and higher protocols.
One advantage of this interface is that it allows the higher protocols to be implemented
`either as mailbox upcalls, which are called whenever their input mailbox is written, or as
`
`9
`
`DEFS-ALA0010717
`Ex.1022.014
`
`DELL
`
`
`
separate threads, which block until the next packet arrives. In our current system, ICMP is
implemented as a mailbox upcall, while UDP and TCP each have their own server threads.
`While the receive interface between IP and higher protocols consists of a simple mailbox,
`the send interface is more complex. To send a packet, higher protocols are expected to
`call IP_Output with a header template, a reference to the data they wish to send, a flag
`indicating whether the data area should be freed once sent, and a route to the destination (if
`known). The header template must contain a partially filled-in IP header. Protocols may
also append their own header to the end of the template. IP_Output fills in the remaining
fields in the IP header and calls the datalink layer to transmit the packet.
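
In C terms, the send interface amounts to a prototype along these lines; the type names and parameter order are our guesses, and only the four arguments themselves are taken from the text.

    /* Hypothetical rendering of the IP send interface described above. */
    struct ip_hdr_template;   /* partially filled-in IP header, possibly with a
                                 higher-protocol header appended              */
    struct msg_ref;           /* reference to the data to be sent             */
    struct route;             /* route to the destination, or NULL if unknown */

    int IP_Output(struct ip_hdr_template *tmpl,
                  struct msg_ref         *data,
                  int                     free_when_sent,
                  struct route           *rt);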
`
`4.2 Transmission Control Protocol
`The Nectar TCP implementation runs almost entirely in system threads, rather than at
`interrupt time. This allows shared data structures to be protected with mutual exclusion
`locks rather than by disabling interrupts. We plan to compare this approach to a strictly
`interrupt-driven implementation of TCP as part of the experiment discussed in Section 3.1.
`All TCP input processing is performed by the TCP input thread. This thread blocks on
a Begin_Get until a packet arrives. Once it gets a packet, it examines the TCP header,
`checksums the entire packet, and performs standard TCP input processing. To pass data
`to the user, TCP simply deletes the headers and transfers the packet to the user's receive
`mailbox using the Enqueue operation.
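
The input thread's main loop, sketched with the mailbox operations named above; the helper routines and exact signatures are assumptions, not the CAB code.

    typedef struct mailbox mailbox_t;
    typedef struct message message_t;

    extern message_t *Begin_Get(mailbox_t *mb);                 /* blocks  */
    extern void       End_Get(mailbox_t *mb, message_t *m);
    extern void       Enqueue(mailbox_t *to, message_t *m);     /* no copy */

    extern int        tcp_checksum_ok(const message_t *m);      /* hypothetical */
    extern void       tcp_process_and_strip(message_t *m);      /* hypothetical */
    extern mailbox_t *tcp_rcv_mailbox_for(const message_t *m);  /* hypothetical */

    void tcp_input_thread(mailbox_t *tcp_input_mb)
    {
        for (;;) {
            message_t *pkt = Begin_Get(tcp_input_mb);   /* wait for a packet */
            if (!tcp_checksum_ok(pkt)) {
                End_Get(tcp_input_mb, pkt);             /* drop it */
                continue;
            }
            tcp_process_and_strip(pkt);                 /* standard TCP input,
                                                           then delete headers */
            Enqueue(tcp_rcv_mailbox_for(pkt), pkt);     /* hand to the user */
        }
    }
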
`A user wishing to send data on an established TCP connection places a request in the
`TCP send-request mailbox. The data to be sent may be placed in the send-request mailbox
`following the request, or it may already exist in some other mailbox, in which case the
`user includes a pointer to it in the request. The TCP send thread on the CAB services this
`request by placing the data on the send queue of the appropriate connection and calling the
`TCP output routine. CAB-resident senders can do this directly without involving the TCP
`send thread.
`
5 Usage
`The flexibility of the CAB software architecture allows us to choose which layers in the
`protocol stack are handled by the CAB, effectively changing the interface the CAB presents
`to the host. Three such interfaces are described below, ordered in increasing degree of CAB
`functionality.
`
`5.1 Network Device
`The Nectar network can be used as a conventional, high-speed LAN by treating the CAB
`as a network device and enhancing the CAB device driver to act as a network interface. We
`have implemented a driver at this level for the Berkeley networking code [11], performing
`IP and higher-level protocols on the host as usual. The advantage of this approach is binary
`compatibility: all the familiar network services are immediately available.
`To perform networking functions, the device driver cooperates with a server thread on
`the CAB that is responsible for transmitting and receiving packets over Nectar. The driver
`and the server share a pool of buffers: to send a packet the driver writes the packet into a
`free buffer in the output pool and notifies the server that the packet should be sent; when
`
`10
`
`DEFS-ALA0010718
`Ex.1022.015
`
`DELL
`
`
`
`a packet is received the server finds a free input buffer, receives the packet into the buffer,
`and informs the driver of the packet's arrival.
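
A rough sketch of the driver's transmit half of that cooperation; all names, sizes, and the notification mechanism are hypothetical.

    #include <string.h>

    struct pktbuf { unsigned int len; unsigned char data[1514]; }; /* size assumed */

    extern struct pktbuf *alloc_output_buffer(void);   /* from the shared pool   */
    extern void           notify_cab_server(struct pktbuf *b); /* e.g. via the
                                                                  CAB signal queue */

    /* Called by the host networking code to transmit one packet over Nectar. */
    void nectar_if_output(const void *pkt, unsigned int len)
    {
        struct pktbuf *b = alloc_output_buffer();
        memcpy(b->data, pkt, len);
        b->len = len;
        notify_cab_server(b);
    }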
`
`5.2 Protocol Engine
`The CAB can be used as a protocol engine by offloading transport protocol processing to
`the CAB. Several interfaces are possible on the host; these interfaces are independent of
`the particular transport protocols implemented on the CAB.
`The Nectarine interface that was described in Section 3.5 provides applications with a
`flexible communication model. Since it uses the host-CAB buffering and synchronization
`facilities directly, some or all of CAB memory must be mapped into applications' address
`spaces.
The familiar Berkeley socket interface [13] is also being implemented at this level.
`Initially, an emulation library will be provided for applications that can be re-linked.
`Eventually, we will move this support into the UNIX kernel, which will intercept operations
`on Nectar connections and dispatch them to the CAB. This approach incurs the cost of
`system calls, but allows binary compatibility.
Work is also in progress to support the Mach interprocess communication interface [1].
Network IPC in Mach is provided by a message-forwarding server external to the Mach
`kernel; this server is a natural candidate for execution on the CAB.
`
`5.3 Application-level