`Communication
`Review
`
`‘26 9
`
`@
`
`Proceedings of
`ACM SIGCOMM’98
`CONFERENCE
`
`
`
`@&
`
`Y),
`
`Applications, Technologies, Architectures,
`and Protocols for Computer Communication
`Vancouver, British Columbia, Canada
`August 31 to September 4, 1998
`1
`
`APPLE 1038
`
`APPLE 1038
`
`1
`
`
`
`Computer
`Communication
`Review
`
`a publication of the acm special interest group on data communication
`Ceeeeeeeeeeeeeeeeeee
`
`Chairman:
`Jon Crowcroft
`Department of Computer Science
`University College London
`Gower Street, London
`WC1E 6BT UK
`Tel +44 71 380 7296
`j-crowcroft@cs. ucl.ac.uk
`
`Vice Chairman:
`
`Sally Floyd
`Lawrence Berkeley National Laboratory
`One Cyclotron Road
`Berkeley, CA 94720
`Tel: +1 510 486 7518
`floyd@ee. Ibl.gov
`
`Secretary-Treasurer:
`Patrick W. Dowd
`Departmentof Electrical Engineering
`A.V. Williams Building
`University of Maryland
`College Park, MD 20742
`Tel: +1 301 405 6750
`dowd@eng.umd.edu
`
`Editor:
`
`Martha Steenstrup
`BBN Technologies
`10 Moulton Street
`Cambridge, MA 02138
`Tel: +1 617 873 3192
`Fax: +1 617 873 6091
`msteenst@bbn.com
`
`Associate Editor:
`
`Lixia Zhang
`4531G Boelter Hall
`Computer Science Department
`UCLA
`Los Angeles, CA 90095
`Tel: +1 310 825 2695
`lixia@cs.ucia.edu
`
`Information Services Director:
`
`Greg Wetzel
`AT&T Bell Laboratories
`101 Crawfords Corner Road
`P.O. Box 3030
`Holmdel, NJ 07733-3030
`Tet +1 908 949 6630
`Infodir_SIGCOMM@acm.org, or
`G_F_Wetzel@att.com
`
`Award Committee Chairman:
`David C. Wood
`NATO C3 Agency
`P.O. Box 174
`2501 CD The Hague
`The Netherlands
`Tel +31 70 3142420
`Fax: +31 70 3142176
`woodd@nc3a. nato. int
`
`SIG Program Director:
`Pat McCarren
`ACM
`1515 Broadway, 17th Floor
`New York, NY 10036
`Tel: +1 212 626 0611
`Fax: +1 212 302 5826
`mecarren@acm.org
`
`Executive Committee:
`Jon Crowcroft
`
`Sally Floyd
`Patrick Dowd
`A. Lyman Chapin
`Martha Steenstrup
`Greg Wetzel
`
`ACM SIGCOMMLecturers:
`
`SIGCOMMMission Statement:
`
`Raj Jain
`Deepinder Sidhu
`lan Akyildiz
`
`SIGCOMM is ACM's professional
`forum for the discussion of topics in the
`field of communications and computer
`networks, including technical design
`and engineering, regulation and
`operations, and the social implications
`of computer networking. The SIG's
`
`members are particularly interested in
`the systems engineering and
`architectural questions of
`communication.
`
`Feedback on our mission is always
`welcome. Please email suggestions
`to any of the membersof the
`SIGCOMM Executive Committee.
`
`Advertising: ACM accepts recruitment advertising under the basic premisethat the advertising employer does not discriminate on the basis of age, color, race,religion, gender, sexual
`preference or national origin. ACM recognizes, however, that laws on such matters vary from country to country and contian exceptions, inconsistencies or contradictions This is as true of
`the laws of the United States of America asit is of other countries. Thus ACM policy requires each advertising employerto state explicitly in the advertisement any ermployment restrictions
`that may apply with respect to age. color,
`race.
`religion, gender, sexual preference,or nationalorigin. Observanceof the legal retirement age in the employer's country is not considered
`discrimination underthis policy
`For advertising information, please contact Walter Andrzejewski (Advertising Manager} at ACM, 1515 Broadway, New York, NY 10036 USA. tel +1 212 869 7440,fax: +1 212 869 0481
`
`COMPUTER COMMUNICATION
`REVIEW is a publication of the ACM
`Special Interest Group on Data
`Communication. Its scope of interest
`includes: data communication systems
`for computers; data communication
`technology for computers: reliability,
`security and integrity of data in data
`communication systems; prablems of
`interfacing communication systems and
`
`computer systems; computer
`communication system modelling and
`analysis.
`
`Items attributed to persons will
`ordinarily be interpreted as personal
`rather than organizational opinions.
`Technical papers appearing in
`Computer Communication Review are
`informally reviewed.
`
`COMPUTER COMMUNICATION REVIEW (ISSN 0146-4833)is publishedfive times a year (January, April, July, and two issues in October)
`by the ACM, Inc., 1515 Broadway, New York, NY 10036. Annual subscription cost of $14.44 is included in the member dues of $50.00 (for
`students, the cost is includec in
`$24.00); the non-member annual subscription is $37.00.
`Penodicals postage paid at New York, Y 10001, and at additional mailing offices.
`POSTMASTER: Send address changes to ACM COMPUTER COMMUNICATION REVIEW, 1515 Broadway, New York, NY 10036.
`
`2
`
`2
`
`
`
`
`
`SIGCOMM’9S
`
`Proceedings
`
`SCIENCE & ENGINEERING
`LIBRARY
`
`FEB 0 5 1999
`EMS COLLECTION
`UCLA
`
`
`
`Artin: Inert Bene
`
`Applications, Technologies, Architectures,
`and Protocols for Computer Communication
`http://www.acm.org/sigcomm/sigcomm9gs
`
`Vancouver, British Columbia, Canada
`September 2 to September 4, 1998
`(Tutorials August 31 and September 1)
`
`|G| SEKARD
`NBC” Bellcore
`- Bay Networks
`= @ newerpce §=—-PMC~---=CY)
`
`
`I
`
`3
`
`
`
`SIGCOMM 1998 in Vancouver, B.C Canada
`
`The Association for Computing Machinery, Inc.
`1515 Broadway
`New York, New York 10036
`
`Copyright © 1998 by the Association for Computing Machinery, Inc. (ACM). Permission to
`make digital or hard copies of portions of this work for personal or classroom useis granted
`without fee provided that the copies are not made or distributed for profit or commercial
`advantage and that copies bear this notice andthe full citation on thefirst page. Copyrights for
`components of this work owned by others than ACM must be honored. Abstracting with
`credit is permitted.
`
`To copy otherwise, to republish, to post on servers or to redistributeto lists, requires prior
`specific permission and/ora fee. Request permission to republish from: Publications Dept.
`ACM,Inc, Fax +1-212-869-0481 or E-mail permissions@acm.org.
`
`For other copying ofarticles that carry a code at the bottom ofthefirst or last page, copying
`is permitted provided that the per-copyfee indicated in the codeis paid through the Copyright
`Clearance Center, 222 Rosewood Drive, Danvers, MA 01923.
`
`ACM ISBN: 1-58113-003-1
`
`See also: http://www.acm.org/sigcomm/sigcomm98
`
`You mayorder additional copies — prepaid — from:
`
`ACM Order Department
`P.O. BOX 11414 Phone: 1-800-342-6626
`ChurchStreet Station
`(U.S.A. and Canada)
`New York, NY 10286 +1-212-626-0500
`(All other countries)
`Fax: +1-212-944-1318
`E-mail: acmpubs@acm.org
`
`ACM Order Number: 533980
`Printed in the U.S.A.
`
`Il
`
`4
`
`
`
`Router Plugins
`A Software Architecture for Next Generation Routers
`
`Dan Decasper', Zubin Dittia?, Guru Parulkar*, Bernhard Plattner’
`[dan|plattner] @tik.ee.ethz.ch, [zubin|guru] @ arl.wustl.edu
`‘Computer Engineering and Networks Laboratory, ETH Zurich, Switzerland
`Phone: +41-1-632 7019 Fax: +41-1-632 1035
`2Applied Research Laboratory, Washington University, St. Louis, USA
`Phone: +1-314-935 4586 Fax: +1-314-935 7302
`
`1. ABSTRACT
`Present day routers typically employ monolithic
`operating systems which are not easily upgradable
`and extensible. With the rapid rate of protocol
`developmentit is becoming increasingly important
`to dynamically upgrade router software in an incre-
`mental fashion. We have designed and implemented
`a high performance, modular, extended integrated
`services router software architecture in the NetBSD
`operating system kernel. This architecture allows
`code modules, called plugins, to be dynamically
`addedand configured at run time. Oneof the novel
`features of our designis the ability to bind different
`pluginsto individual flows; this allows for distinct
`plugin implementations to seamlessly coexist in the
`same runtime environment. High performanceis
`achieved through a carefully designed modular
`architecture; an innovative packetclassification
`algorithm that is both powerful and highlyefficient;
`and by caching that exploits the flow-like character-
`istics of Internet traffic. Compared to a monolithic
`best-effort kernel, our innplementation requires an
`averageincrease in packet processing overhead of
`only 8%, or 500 cycles/2.1ms per packet when run-
`ning on a P6/233.
`
`1.1 Keywords
`High performance integrated services
`router architecture, router plugins
`
`routing, modular
`
`2. INTRODUCTION
`New network protocols and extensions to existing protocols
`are being deployed on the Internet. New functionality is
`being added to modern ip routers at an increasingly rapid
`pace. In the past, the main task of a router was to simply
`forward packets based on a destination address lookup.
`Modernrouters, however, incorporate several new services:
`
`Permission to make digital or hard copies of all o1 part of this work for
`personal or classroom use is granted without fee provided thet
`copies are not made or distributed for profit or commercial adven-
`tage and that copies bear this notice and thefull citation on the first page.
`To copy otherwise, to republish, to post on servers or to
`redistribute to lists, requires prior specific permission and/or a tee.
`SIGCOMM ‘98 Vancouver, B.C.
`© 1998 ACM 1-58113-003-1/98/0008...$5.00
`
`Monolitic Best-Effort Architecture
`oul
`aa
`
`
`
`Modular Extendedintegraled Services Architecture
`Rouen ASP]
`t
`
`
`User
`Kernel
`
`
`
`
`Taurean ||
`aegating
`
`: Best Effort vs
`Figure 1.
`Extended Integrated Services Router (EISR)
`
`¢
`
`Integrated/difterentiated Services
`*
`¢ Enhanced routing functionality (level 3 and level 4 rout-
`ing and switching, QoS routing, multicast)
`Security algorithms (e.g.
`to implement virtual private
`networks(VPN))
`« Enhancements to existing protocols (e.g. Random Early
`Detection (RED))
`* New core protocols (e.g. Ipv6 [8])
`Figure 1 contrasts the software architecture of our proposed
`Extended Integrated Services Router (EISR) with that of a
`conventional best-effort
`router. A typical EISR kernel
`features the following important additional components: a
`packet scheduler, a packet classifier, security mechanisms,
`and QoS-based
`routing/Level
`4
`switching. Various
`algorithms and implementations of each component offer
`specific advantages in terms of performance, feature sets,
`and cost. Most of these algorithms undergo a constant
`evolution and are replaced and upgraded frequently. Such
`networking subsystem components are characterized by a
`relatively
`“fluid”
`implementation,
`and
`should
`be
`distinguished from the small part of the network subsystem
`code that remainsrelatively stable. The stable part (called the
`core) is mainly responsible for interacting with the network
`hardware
`and for demultiplexing packets
`to
`specific
`modules. Different implementations of the EISR components
`outside of the core often need to coexist. For example, we
`might want to use one kind of packet scheduling on one
`interface, and a different kind on another.
`
`In this paper, we propose a software architecture and present
`an implementation which addresses these requirements. The
`specific goals of our frameworkare:
`
`specific
`Implementation of
`¢ Modularity:
`comein the form of modulescalled plugins'.
`
`algorithms
`
`5
`
`
`
`¢ Extensibility: New plugins can be dynamically loaded
`at run time.
`
`The main contributions of our work are:
`
`e An innovative, modular, extensible, and flexible EISR
`Flexibility: Instances of plugins can be created, config-
`*
`networking subsystem architecture and implementation
`ured, and bound to specific flows. Plugins can beall-
`that introduces only 8% more overhead thanabest-effort
`kernel.
`software modules, or they can be software drivers for
`specialized custom hardware.
`* Performance: The system should provide for a very
`efficient data path, with no data copying, no context
`switching, and no additional interrupt processing. The
`overhead of modularity should not seriously impact per-
`formance.
`
`Our proposed framework has been implemented in the
`NetBsD UNIX kernel. This platform was selected because of
`its portability (all major hardware platforms are supported),
`efficiency, and extensive documentation. In addition, we
`found state-of-the-art implementations on this platform for
`ipvG [13] and packet schedulers [27, 5]
`that could be
`integrated into our framework.
`
`We envision several applications for our framework. First,
`our architecture fits very well into the operating system of
`small and mid-sized routers. It is particularly well suited to
`the implementation of modern edge routers
`that are
`responsible for doing flow classification, and for enforcing
`the configured profiles of differential service flows. This
`kind of enforcement can be doneeither on a per-application
`flow basis, or on a generalized class-based approach (e.g.
`CBQ [11]). Our
`implementation supports both models
`efficiently.
`
`Ourframework is also very well suited to Application Layer
`Gateways (ALGs), and to security devices like Firewalls. In
`both situations, it is very important to be able to quickly and
`efficiently classify packets into flows, and to apply different
`policies to different flows:
`these are both things that our
`architecture excels at doing.
`
`Yet another application of our framework is for network
`management applications, which typically need to monitor
`transit traffic at routers in the network, and to gather and
`report various statistics thereof. For such applications, it is
`important to be able to quickly and easily change the kinds
`of statistics being collected, and to do this without incurring
`significant overhead on the data path.
`
`in
`Finally, while our proposed framework is very useful
`real-world implementations, its modularity and extensibility
`also makeit an invaluable tool for researchers. We plan to
`release all of our code in the public domain and we will
`attempt
`to incorporate several core portions
`into the
`standard NetBsp distributiontree.
`
`' 4 note on ouruseof the word ‘plugin’ (instead of ‘module’) is in order.
`In the web browser world, a plugin is a software modulethat is dynami-
`cally linked with the browser and is responsible for processing certain
`types of application streams (or flows). In a similar fashion, our router
`plugins are kernel software modulesthat are dynamically loaded into the
`kernel and ate responsible for performing certain specific functions on
`specified network flows.
`
`*
`
`* A very fast packet classifier algorithm which provides
`highly competitive upper boundsfor classification times.
`With a very large number offilters (in the order of
`50000), it classifies IPv6 packets in 24 memoryaccesses,
`and is muchfaster for smaller numbers offilters.
`Implementations of plugins
`for
`two state-of-the-art
`packet schedulers: Deficit Round Robin (pre, [23]) for
`fair queuing, and the Hierarchical Fair Service Curves
`(H-FSC, [27]) scheduler for class-based packet schedul-
`ing; Implementationof plugins for IP security [2].
`There are a few commercial attempts that we are aware of
`which follow similar lines. The latest versions of Cisco’s
`Internet OS
`(ios,
`[6]) claims
`to fulfill
`some of
`the
`requirements, butsince it’s a commercial operating system,
`there is no easy access for the research community and these
`claims are not verifiable. Microsoft’s Routing and Remote
`Access Service for Windows NT (RRAS, previously referred
`to as “Steelhead” {18, 19]) is an attempt to implementrouter
`functionality under Windows NT. RRAS exports an API and
`allows third party modules to implement routing protocols
`like OSPF and SNMPagents in user space. The API does not
`provide an interface to the routing and forwarding engines,
`and the platform offers no integrated services components.
`A few research projects attempt to achieve someofthe goals
`mentioned above [12, 20, 21]. Most of them are focused on
`the implementation of modular end-system networking
`subsystemsinstead of routing architectures. Scout from the
`University of Arizona is a particularly interesting project
`based on the x-kernel that implements an operating system
`targeted at network appliances (including routers). It comes
`with router components implementing simple QoS support.
`Since the whole operating system is implemented from
`scratch, most of
`the provided functionality is over-
`simplified and does not provide the large feature set that is
`found in mature implementations. We discuss these related
`approaches in more detail in [7].
`
`In Section 3, we describe ourarchitecture and explain howit
`achieves modularity, extensibility, and flexibility while
`maintaining high-performance. In Section 4, we describe
`the implementation of a module called the Plugin Control
`Unit
`(Pcu), which is
`responsible for all controi path
`interactions with
`plugins.
`SectionS
`outlines
`the
`implementation of the Association Identification Unit (AIU),
`which is used by almostall other componentsin our design.
`The AIU implements an innovative algorithm for packet
`classification which efficiently maps packets
`to code
`modules (plugins). In Section 6, we elaborate on example
`plugins (packet schedulers) which we implemented or
`adapted
`for
`our
`environment.
`Section?
`presents
`performanceresults from our implementation, and Section 8
`summarizes our ideas.
`
`230
`
`6
`
`
`
`3. OVERALL ARCHITECTURE
`The primary goal of our proposed architecture was to build a
`modular
`and
`extensible
`networking
`subsystem that
`supported the concept of flows, and the ability to select
`implementations of components based upon flows
`(in
`addition to simple static configurations). Becausc the
`deployment of multimedia data sources and applications
`(e.g. real-time audio/video) will produce longerlived packet
`streams with more packets per session than is common in
`today’s
`environment,
`an
`integrated
`services
`router
`architecture should support the notion of flows and build
`uponit. In particular, the locality properties of flows should
`be effectively exploited to provide for a highly efficient data
`path. Our plugin framework features:
`
`* Dynamic loading and unloading of plugins at run time
`into the networking subsystem. Plugins are code mod-
`ules which implement a specific EISR functionality (e.g.
`packet scheduling). NetBsp offers a simple yet powerful
`mechanism which allows modules to be loaded into the
`kernel which is used to load our plugins into the kernel.
`Oncea plugin is loaded,it is no different from any other
`kernel code. Whatis required for our system is a compo-
`nent which glues the individual plugins to the network-
`ing subsystem, and which provides a control-path
`interface used by other kernel components (possibly also
`other plugins) and user space daemonsto talk to the
`plugin.
`In our system,
`this component
`is called the
`Plugin Control Unit (Pcu). The pcU hides some of the
`implementation specific details from the individual plu-
`gins and allows them to access the system in a simple yet
`flexible fashion.
`
`« Creation of individual instances of plugins for maximal
`flexibility. An instance is a specific run-time configura-
`tion of an individual plugin. It is often very desirable to
`have multiple instances of one and the same plugin con-
`currently in the kernel. For example, consider packet
`scheduling. A packet scheduler can work with different
`configurations on different network interfaces. State-of-
`the-art packet schedulers are usually hierarchical, with
`possibly different modules working on different levels of
`the scheduling hierarchy. Among the nodes ofthe same
`level, modules are specifically configured, which means
`that they coexist in our framework as plugin instances.
`In order to provide a simple and unified interface for the
`allocation of multiple instances of one and the same
`plugin, the plugins must respond to a set of standardized
`messages. By standardizing this message set and imple-
`menting it in all plugins, we guarantee interoperability
`amongdifferent plugins and provide a simple configura-
`tion interface.
`
`° Efficient mapping of individual data packets to flows,
`and the ability to bind flows to plugin instances. Sets of
`flows are specified using filters. For example, a filter
`might match all TCP traffic from the network 129.0.0.0
`to the host 192.94.233.10. Filters can also match individ-
`ual end-to-end application flows. Filters are specified as
`six-tuples:
`
`<source address, destination address, protocol, source
`port, destination port, incoming interface>
`
`Any of the fields in the six tuple may be wildcarded.
`Additionally, for network addresses, a prefix mask may
`be used to partially wildcard the corresponding field. For
`instance, for the above example, the filter specification
`would read: </29.* **, 192.94.233.10, TCP, *, *, *>
`
`the filter for an end-to-end application flow
`Clearly,
`would have all
`fields (except perhaps the incoming
`interface) fully specified. We will see later in this section
`that a packet matching a particular filter will be passed
`to the plugin instance that has been boundto thatfilter.
`This will be shown to happen whenever the packet
`reaches a “gate” in the 1p stack; a gate can be thought of
`as the entry point for a plugin.
`
`* Overall high performance. High performance is guaran-
`teed only in part through a fully kernel space implemen-
`tation which prevents costly context
`switches. We
`identified two other critical properties which, when com-
`bined, guarantee high performance even in a highly
`modular environment: the flow-like nature of most inter-
`net traffic, and the ability to classify packets into flows
`quickly and efficiently. As we show below,
`the filter
`lookup to determine the right plugin instance to which a
`packet should be passed happensonlyforthe first packet
`of a burst. Subsequent packets get this information from
`a fast flow cache which temporarily stores the informa-
`tion gathered by processing the first packet. The filler
`lookupitself is efficiently implemented using a Directed
`Acyclic Graph (DAG). We elaborate on these techniques
`later in this section, and also in section 5.
`* Easy integration with custom hardware for high perfor-
`mance processing of specialized tasks. This is enabled
`by plugins which are software drivers for hardware that
`implements the desired functionality. For example, a
`plugin could control hardware engines for tasks such as
`packetclassification or encryption.
`In order to describe our framework, we first look at the
`different components and how they interact in the control
`path. In the Section 3.2, we will look at the data path, and
`how individual packets are processed by our architecture.
`
`3.1 The Control Path
`Figure 2 shows the architecture of our system and the
`control communication between different components. A
`description of the different components follows:
`¢
`[Pv4/IPv6 core: The ipv4/fipv6 core consists of a
`stream-lined IPv4/Ipv6 implementation which contains
`the (few) components required for packet processing
`which do not comein the form of dynamically loadable
`modules. These are mainly functions that interact with
`network devices. The core is also responsible for demul-
`tiplexing individual packets to plugins as we will show
`in the next section. There are no plugin related control
`path interactions with the IP core.
`
`231
`
`7
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`ae mean
`massage
`te plugins
`To rte—Sastanés
`nner
`registars
`<{,
`callback
`
`
`involves the following steps:
`akUbie:
`i
`U Plugin
`“dansger-)(5.2MAVPSRMaatedremons
`eee Hn ee eee
`¢ Loading a plugin: Using the medload command, which
`L
`L
`‘
`sel
`i
`is part of the NetsspD distribution, plugins are loaded into
`the kernel. Onloading, they register themselves with the
`PCU by providing a callback function. This function is
`used to send messagesto the plugin. There are messages
`for creating and freeing instances of the plugin and for
`binding plugin instances to flows. Also, plugin develop-
`ers can define an arbitrary number of plugin specific
`messages. Once the callback function for a plugin has
`been registered, the PCU can forward these configuration
`messagesto the plugin.
`* Creating an instance of a plugin: Using the Plugin
`Manager application, configuration messages can be
`sent to specified plugins. Typically, these messages ask
`the plugin to create an instance ofitself. In case of a
`packet scheduling plugin for example, the configuration
`information could include the network interface the
`plugin should work on.
`* Creating filters: Once a plugin has been configured and
`an instance has been created,it is ready to be used. What
`has to be defined next is the set of datagrams which
`should be passed to the instance for processing. This is
`done by binding one or more flows to the plugin
`instance. To specify the set offlows that are supposed to
`be handled by a particular plugin instance, the Plugin
`Manageror one of the user space daemons(RSVPor SSP)
`can create filters through calls to the Al. Recall (from
`earlier in this section) that a filter is a specification for
`the set of flows it matches.
`* Binding flowsto instances: Next, the binding between
`filters and plugin instances mustbe established. Eachfil-
`ter in the ATU is associated with a pointer to a plugin
`instance; this pointer is set by making anothercall to the
`AIU to do the binding.
`Now the system is ready to process data packets. We will
`show in the next subsection how data packets are matched
`againstfilters and how they get passed to the appropriate
`instances.
`
`Figure 2.
`
`: System Architecture and Control Path
`
`* Plugins: Figure 2 shows four different types of plugins -
`plugins implementing Ipv6 options, plugins for packet
`scheduling, plugins to calculate the best-matching prefix
`(BMP, used for packet classification and routing), and
`pluginsforIP security. Other plugin typesare also possi-
`ble: e.g., a routing plugin, a statistics gathering plugin
`for network managementapplications, a plugin for con-
`gestion control (RED), a plugin monitoring TCP conges-
`tion backoff behaviour, a firewall plugin. Note that all
`plugins comein the form of dynamically loadable kernel
`modules.
`
`* Plugin Control Unit (PCU): The pcu managesplugins,
`and is responsible for forwarding messagesto individual
`plugins from other kernel components, as well as from
`user space programs(usinglibrary calls).
`* Association Identification Unit: The Association Iden-
`tification Unit (AlU) implements a packet classifier and
`builds the glue between the flows and plugin instances.
`The operation of the Alu will become clear when we
`describe the data path in the next subsection.
`¢ Plugin Manager: The Plugin Manageris a user space
`utility used to configure the system.It is a simple appli-
`cation which takes arguments from the command line
`and translates them into calls to the user-space Router
`Plugin Library which we provide with our system. This
`library implements the function calls needed to config-
`ure all kernel
`level components.
`In most cases,
`the
`plugin manager is invoked from a configuration script
`during system initialization, but it can also be used to
`manually issue commandsto various plugins. We show
`an example of how the Plugin Manager is used in
`Section 6.
`
`3.2 The Data Path
`Data packets in our system are passed to instances of
`plugins which implement
`the
`specific
`functions
`for
`processing the packets. Since data path mechanisms are
`applied to every single packet,
`it
`is very important
`to
`optimize their performance. Given a packet, our architecture
`should be able to quickly and efficiently discover the set of
`instances that will act on the packet.
`
`The data path interactions are shown in Figure 3.Before we
`can explain the sequence of actions, we have to introduce
`the notion ofa gate.
`
`A gate is a point in the IP core where the flow of execution
`branches off
`to an instance of a plugin. From an
`implementation point of view, gates are simple macros
`which encapsulate function calls to the ATU that will return
`
`* Daemons: ‘he rSvP [31], SSP [1] (a simplified version
`of RSVP), and route daemonare linked against the Router
`Plugin Library to perform their respective tasks. We
`implemented an ssP daemon for our system, and are cur-
`rently in the process of porting an RSVP implementation.
`After a reboot, the system has to be configured beforeit 1s
`ready to receive and forward data packets. Configuration
`involves the selection of a set of plugins. Since a selection
`does not necessarily apply to all packets traversing the
`router, a definition of the sct of packets which should be
`processed by each individual plugin instance is required.
`This configuration can be done either by a
`system
`administrator, or by executing a script. Configuration
`
`Nw
`
`8
`
`
`
`
`cP
`- As
`[ wanSgee
`2
`
`“RSVP/SSP ansman
`
` fecE
`
`
`
`
`}
`
`
`
`
`
`lacalle @
`
`inotance
`
`
`
`
`
`
`
`
`
`
`Figure 3.
`
`: System Architecture and Data Path
`
`to be used for
`the correct plugin instance which is
`these macros can
`processing the packet.
`In many cases,
`avoid a function call
`to the Alu altogether,
`thereby
`permitting a more efficient
`implementation. Gates are
`placed wherever interactions with plugins need to take
`place. For example, sometimesafter a packet is received by
`the hardware,
`IP security processing has to be done if the
`system is configured as entry point into a virtual private
`network.
`In
`our
`system,
`IP
`security
`functions
`are
`modularized and come in the form of plugins. A gate is
`inserted into the IP core code in place of the traditional call
`to
`the kernel
`function responsible for
`IPv6_
`security
`processing. In our current implementation, we use gates for
`ipv6 option processing, IP security, packet scheduling, and
`for the packetfilter’s best-matching prefix algorithm.
`
`To follow the various data path interactions, it is important
`to get a basic understanding of the operation of the AIU. The
`AIU is responsible for maintaining the binding between
`flows and plugin instances. It makes use of a special data
`structure called a flow table to cache flows. Flow tables
`allow for very fast lookup times for arriving packets that
`belong to cached flows.
`
`In the AIU, all flows start out being uncached (ie., they do
`not have an entry in the flow table). If an incoming packet
`belongs to an uncached flow, its lookup in the flow table
`data structure will fail (i.e., there is a cache miss). In this
`case, the packet needs to be looked up in a different data
`structure that we call a filter table. Filter tables store the
`bindings betweenfilters and plugins for each gate. The filter
`table lookup algorithm finds the most specific matching
`filter (described later) that has been installed in the table,
`and returns the corresponding plugin instance. Usually,filter
`table lookups are much slowerthan flow table lookups. An
`entry for a flow in the flow table serves as a fast cache for
`future lookups of packets belonging to that flow. Each flow
`table entry stores pointers to the appropriate plugins for all
`gates that can be encountered by packets belonging to the
`corresponding flow. The processing of the first packet of a
`new flow with n gates involves nfilter table lookups to
`create a single entry in the flow table for the new flow.
`
`If a cached flow remains idle (i.e., no new packets are
`received) for an extended period, its cached entry in the flow
`table data structure may be removed (or replaced by a
`different flow).
`In this case, if the flow becomes active
`
`i)
`
`ore)
`
`again,the first packet that is received would again result in a
`cache miss, which would again cause a new cache entry to
`be created in the flow table so that subsequent packets can
`benefit from faster lookup times.
`
`table lookup
`filter
`a very fast
`Section 5.1 describes
`implementation based on directed acyclic graphs (DAGs).
`Section 5.2 describes our flow table implementation, which
`is based on hashing.
`
`As an example, consider the steps involved in processing an
`Ipv6 packet (see numbers 1-6 in Figure 3). Uncached flow
`processing involves the following sequence of events and
`actions:
`
`3.
`
`0. Packet arrival: When a packetarrives, it gets passed to
`the IP core by the network hardware. As it makesits
`waythroughthe core, it may encounter multiple gates.
`1. Encountering a gate: Assume that
`the packet has
`reached the gate where IP security processing will be
`handled. The task of this gate is to find the plugin
`instance which is responsible for applying security pro-
`cessing (authentication and/or encryption) to the packet.
`2. Discovering the right instance: The gate makesa call
`to the AIU. The parametersofthecall are a pointer to the
`packet and an identification of the gate issuing the call.
`In our case, we would identify the IP security gate as the
`caller.
`Packetclassification: The AlU first does a lookupin the
`flow table, and finds that there is no cached entry avail-
`able for the flow. Consequently, it performs a lookup in
`the filter table correspondingto the IP security gate. The
`resulting plugin instance pointer is returned to the call-
`ing gate (“SEC2” in Figure 3). Note that since this
`packet classification step performed by the AIU is the
`most expensive step in the whole cycle, an efficient
`packet classification scheme and implementation is
`important.
`4. Caching of the instance pointer: Before the AIU
`returns the instance pointer to the gate,
`it sto