`Varghese et al.
`
`54 METHOD AND APPARATUS FOR FAST
`HIERARCHICAL ADDRESS LOOKUPUSING
`CONTROLLED EXPANSION OF PREFIXES
`
`75 Inventors: George Varghese, St. Louis;
`Srinivasan Venkatachary, University
`City, both of Mo.
`73 Assignee: Washington University, St. Louis, Mo.
`
`21 Appl. No.: 08/821,100
`22 Filed:
`Mar 20, 1997
`(51) Int. Cl." ..................................................... H04L 12/46
`52 U.S. Cl. ...................... 370/392; 370/401; 395/200.75
`58 Field of Search ..................................... 370/392,393,
`370/401, 402, 403, 404, 405, 466, 467;
`395/200.68, 200.75
`
`56)
`
`References Cited
`
`U.S. PATENT DOCUMENTS
`
`4,897,814 1/1990 Clark ......................................... 365/49
`5,247,620 9/1993 Fukuzawa et al. ..
`... 370/402
`5,261,090 11/1993 Lien .....................
`... 395/600
`5,313,465 5/1994 Perlman et al. .....
`... 370/403
`5,386,413
`1/1995 McAuley et al.......................... 370/54
`5,414,704 5/1995 Spinney .................................... 370/60
`5,446,887 8/1995 Berkowitz ...
`... 395/600
`5,459,717 10/1995 Mullan et al. ............................ 370/54
`5,761,440 6/1998 De Marco et al. ......
`... 370/392
`5,781,772 7/1998 Wilkinson, III et al. ...
`... 370/229
`5,842,224 11/1998 Fenner .....................
`... 370/392
`5,856,974
`1/1999 Gervais et al. ......................... 370/401
`OTHER PUBLICATIONS
`Perlman, Radia, 9.7 Address Matching, Interconnection
`Bridges and Routers, pp. 233-239.
`Wright, Gary R., Radix Tree Routing Tables, TCP/IP Illus
`trated, Volume 2, The Implementation, pp. 559-569.
`
`US00601 1795A
`Patent Number:
`11
`(45) Date of Patent:
`
`6,011,795
`Jan. 4, 2000
`
`Mills, Don, The Art of Computer Programming, Volume
`3/Sorting and Searching, pp. 481–499.
`
`Primary Examiner Min Jung
`Attorney, Agent, or Firm Howell & Haferkamp, L.C.
`
`57
`
`ABSTRACT
`
`Many network protocols, including the Internet, have
`addresses that are structured hierarchically. The hierarchy is
`expressed by an address prefix P that represents all addresses
`in the given hierarchical level that starts with prefix P. The
`hierarchy is not Strict and can be overridden by more
`inclusive hierarchies. This is achieved by having network
`routers find the longest prefix that matches a destination
`address in a message.
`
`The disclosed invention describes a method and apparatus
`for implementing controlled expansion: for expanding a Set
`of prefixes into an equivalent (possibly larger) set of prefixes
`that have a smaller set of prefix lengths. The new set of
`prefixes can then be looked up significantly faster using any
`technique whose Speed improves by reducing the number of
`prefix lengths. Our invention also incorporates fast tech
`niques to insert and delete new prefixes, and a technique of
`pointer hoisting to halve the memory READs needed for trie
`Search. Our Solution to longest matching prefix also applies
`to other routing protocols Such as OSI Routing, call routing
`in telephone networks, and to String matching problems.
`
`20 Claims, 10 Drawing Sheets
`
`
`
`
`
`
`
`
`
`L2
`
`OOk
`Ot
`
`101
`
`
`
`P4
`
`
`
`EX.1096.001
`
`DELL
`
`
`
`U.S. Patent
`
`Jan. 4, 2000
`
`Sheet 1 of 10
`
`6,011,795
`
`Source
`
`
`
`San
`Fransisco
`
`California
`
`
`
`
`
`Switch packet
`
`-
`
`- -
`
`
`
`
`
`
`
`
`
`
`
`
`
`Destination ink
`
`
`
`
`
`
`
`ROuter R
`
`FIG. 2
`
`EX.1096.002
`
`DELL
`
`
`
`U.S. Patent
`
`Jan. 4, 2000
`
`Sheet 2 of 10
`
`6,011,795
`
`
`
`
`
`
`
`Destination ink
`
`
`
`USA. CA
`USA. MA. BOSton
`
`L2
`
`L2
`
`FIG. 4.
`
`
`
`
`
`P6
`
`
`
`L6
`100
`10111 * T L2
`F.G. 5
`
`1 *
`L2
`Length 1
`10* L3
`Length 2
`|Length 3 | 101* L2
`length 5
`10111. L7
`
`01* L1
`100* L2
`
`FIG. 6
`
`EX.1096.003
`
`DELL
`
`
`
`U.S. Patent
`
`Jan. 4, 2000
`
`Sheet 3 of 10
`
`6,011,795
`
`Root Node
`Node 1
`
`P
`
`O
`1.
`
`11' e. 0
`
`Node 2
`P3
`O
`1.
`
`Node 3
`
`1.
`
`a P6
`
`FIG. 7
`
`
`
`
`
`
`
`
`
`EX.1096.004
`
`DELL
`
`
`
`U.S. Patent
`
`Jan. 4, 2000
`
`Sheet 4 of 10
`
`6,011,795
`
`Pointer BestPrefix
`000
`001
`010 P1 (L1)
`o11 P1 (L1)
`i 100
`101 P4 (L.2)
`110
`
`Pointer BestPrefix
`
`
`
`
`
`
`
`110 P6 (L7)
`
`EX.1096.005
`
`DELL
`
`
`
`U.S. Patent
`
`Jan. 4, 2000
`
`Sheet 5 of 10
`
`6,011,795
`
`
`
`Destination
`(100111)
`
`FIG 11
`
`Link Level Header-uul
`
`nony Header
`6
`FIG. 12
`
`EX.1096.006
`
`DELL
`
`
`
`U.S. Patent
`
`Jan. 4, 2000
`
`Sheet 6 of 10
`
`6,011,795
`
`
`
`OutputLink 6
`
`O
`
`Outputlink in
`
`EX.1096.007
`
`DELL
`
`
`
`1. Do controlled expansion of prefixes
`l
`Expanded s S' of prefixes with
`fewer distinct lengths.
`Petro?co,
`Embodiment
`Embodiment
`Build Trie
`for set S (
`
`
`
`
`
`Build data structure
`for lookups using S'
`
`U.S. Patent
`
`Jan. 4, 2000
`
`Sheet 7 of 10
`
`6,011,795
`
`Original Set S of Prefixes
`provided by routing updates
`
`Other
`Embodiments
`
`Fraion
`Fastie Lookup Lookup addresses
`of addresses (3)
`
`
`
`- - - - -------
`
`FIG. 14
`
`W = Max
`Prefix length
`
`
`
`
`
`
`
`All prefixes of
`length L
`
`EX.1096.008
`
`DELL
`
`
`
`U.S. Patent
`
`Jan. 4, 2000
`
`Sheet 8 of 10
`
`6,011,795
`
`FOREACHORIGINAL (Prefix, Link) PAIR
`PLACE (Prefix, Link) IN LIST STARTING
`1 ATPosition L OF Table WHERE LIS
`LENGTH OF Prefix
`
`2 Currentlength = 0
`re IS Currentlength DIVISIBLE BY Stridellength X?
`es
`4.
`ls List Current ength Empty?
`LET (P,L) BE FIRSTELEMENT OF
`LIST STARTING AT POSITION Current ength
`EXPAND PENTO TWO PREFIXES PO and P1
`IF PO IS NOT IN LIST CurrentLength + 1 THEN
`ADD (P0, L) TO LIST Currentlength + 1
`
`6
`
`5
`
`FP1 IS NOT IN LIST Currentlength -- 1 THEN
`ADD (P1, L) TO LIST Currentlength + 1
`9 DELETE (PLFROM List Currentlength
`Current ength:= Currentlength + 1
`urrentlength > Maximum Prefix ength?
`
`O
`
`
`
`1.
`
`No
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`
`Output all (Prefix, Link) pairs in Table
`
`FIG 16
`
`New N
`prefix P
`
`FIG 17
`
`EX.1096.009
`
`DELL
`
`
`
`U.S. Patent
`
`Jan. 4, 2000
`
`Sheet 9 of 10
`
`6,011,795
`
`Pointer Best Prefix
`
`ROO
`
`01.0
`011 P1 (L1)
`P5 (IL6)
`101 P4 (L.2)
`110 P2 (Li)
`111 P2 (L.2)
`
`Pointer BestPrefix
`
`
`
`
`
`
`
`
`
`P7 (L3)
`
`110
`111
`
`P6 (7)
`P6 (7)
`
`FIG. 18
`
`Root
`
`N Path(N), path from root to node N
`N
`Ya
`Pointer BestPrefix
`
`
`
`node N
`
`Position J
`
`Child of node N
`
`positions
`
`
`
`FIG
`
`9
`
`EX.1096.010
`
`DELL
`
`
`
`U.S. Patent
`
`Jan. 4, 2000
`
`Sheet 10 of 10
`
`6,011,795
`
`NPUT Address
`
`INPUT STRIDEX 2
`
`CurrentNode := ROOT 3
`BestPrefixSeen := NL
`
`EXTRACT NEXT X BITS
`OF ADDRESS ECUAL TO
`
`5 INDEX INTO POSITIONJ
`OF CurrentMode
`
`CurrentMode := PointerJ
`
`yes
`
`SBestPrefix J = NIL?
`O
`BestPrefixSeen := BestPrefix (J) 7
`
`6
`
`
`
`5 is Pointer (J) - NIL?
`
`8
`
`Yes
`
`RETURN BestPrefixSeen
`
`FIG. 20
`
`EX.1096.011
`
`DELL
`
`
`
`1
`METHOD AND APPARATUS FOR FAST
`HIERARCHICAL ADDRESS LOOKUP USING
`CONTROLLED EXPANSION OF PREFIXES
`
`6,011,795
`
`2
`1. We show a schematic description of router R in FIG. 2.
`When a message arrives on Say link L4, the message carries
`its destination address (San Francisco) in its message header.
`Router R is a special computer whose job is to forward all
`messages that are Sent to it towards their final destinations.
`To do so, router R consults a Forwarding Table (sometimes
`also called a Forwarding Database). This is a table in the
`memory of R, which lists each possible destination and the
`corresponding output link. Note that the
`Forwarding Table
`contents are consistent with FIG. 1.
`Thus when a message to San Francisco arrives on Link
`L4, router R looks up the destination address San Francisco
`in its forwarding table. Since the table says
`L2, the router
`then Switches the entire message to the output link L2. It
`then proceeds to Service the next arriving message. Notice
`that so far the word “lookup” is no different from looking up
`a word in a dictionary or a phone number in the phone book.
`We will show it is more difficult than dictionary or phone
`book lookup shortly.
`Thus the two main functions of a router are to lookup
`destination addresses (address lookup) and then to Send the
`packet to the right output link (message Switching). To be
`more precise, there are Some additional chores Such as
`incrementing a visit count in a message, but these chores are
`fairly trivial compared to lookup and Switching. Both must
`be done at very high Speeds. Fortunately, the problem of
`message Switching is very well understood in recent years
`because of advances in ATM Switching Technology. Eco
`nomical gigabit message Switching is quite feasible today
`because of the work of others. (Thus one can imagine a
`router as having an ATM core to Switch packets.)
`We have already seen that of the two main functions of a
`router, message Switching is a Solved problem and already
`available in many commercial products. Despite this, the
`problem of doing address lookups at Gigabit speeds
`remains. Current vendor Speeds for lookups are quite slow.
`For example, existing products we know of use hardware
`assistance for lookups and can take up to 3 uS for a single
`lookup in the worst case and 1 uS on average. Our invention,
`on the other hand, gives ten times faster address lookup
`performance (lookups in around 0.1 us).
`Before we describe how our invention works, it is impor
`tant to understand why Internet address lookup is hard. It is
`hard for two reasons. First, Internet addresses are not
`Specially created (like ATM addresses) to be easy to lookup.
`Second, the Internet deals with Scaling issues by using
`address prefixes which requires a more complex lookup. We
`describe details below.
`First, looking up Internet addresses is a lot harder than Say
`looking up ATM addresses. ATM addresses (VCIs) are
`carefully chosen to be simple to lookup in Switch tables.
`Unfortunately, ATM addresses must be set up for each
`conversation which adds delay; by contrast, Internet
`addresses (like postal addresses) are relatively fixed and
`there is no additional Set up delay per conversation.
`Secondly, ATM addresses do not currently make much
`provision for hierarchical networks and So are perceived not
`
`to be scalable to truly global networks. IP,
`through the use
`of prefixes (see below), has provision for Scaling. Thus for
`various reasons, the Internet and ATM Seem to be each going
`their own way. In the future, they are likely to coexist with
`ATM backbones and ATM LANs in the Internet. So, with
`respect to ATM, i) IP address lookup is a lot harder and ii)
`the Internet is unlikely, if at all, to change completely to
`ATM.
`The Second thing to realize is that the Internet lookup
`problem is a lot harder than looking up a phone number in
`
`1O
`
`15
`
`40
`
`45
`
`25
`
`BACKGROUND OF THE INVENTION
`The Internet is becoming ubiquitous: everyone wants to
`join in. Statistics Show that the number of computers on the
`internet is tripling approximately every two years. Traffic on
`the Internet is also increasing exponentially. Traffic increase
`can be traced not only to increased hosts, but also to new
`applications (e.g., the Web, Video conferencing, remote
`imaging) which have
`higher bandwidth needs than tradi
`tional applications. One can only expect further increases in
`users, computers, and traffic. The possibility of a global
`Internet with multiple addresses per user (e.g., each user
`may have several appliances on the Internet) has necessi
`tated a transition from the older Internet routing protocol
`(called IPv4, with small 32 bit addresses) to the proposed
`next generation protocol (called IPv6, with much larger 128
`bit addresses).
`The increasing traffic demand placed on the network
`forces two key factors to keep pace: first, the Speed of
`communication links, and Second, the rate at which routers
`(routers are boxes that route messages
`in the Internet, very
`much like automated Post Offices in the postal network) can
`forward messages. With the advent of fiber optic links, it is
`easily and economically possible to Solve the first problem.
`For example, MCI is currently upgrading its Internet back
`bone links from 45 Mbits/s to 155 Mbits/s, and they plan to
`Switch to 622Mbits/s within a year. However, improving the
`Speed of communication linkS is insufficient unless router
`forwarding Speeds increase proportionately.
`Today's fastest routers forward messages at a maximum
`rate of 100,000 to 500,000 messages a second. However,
`35
`communication link Speeds are already reaching Speeds of 1
`Gigabit/sec (1 Gigabit=1000 million bits per second). A
`router has to forward 5 million messages (of average size say
`200 bits) per second to keep up with the speed of a Gigabit
`link. With the popularity of the Internet and the larger traffic
`Volumes expected, router vendors wish to increase the
`forwarding performance of their routers.
`The major problem that routerS face in forwarding an
`Internet message is Something known as address lookup. To
`understand this, we must first have an intuitive idea of what
`a router does. Consider a hypothetical fragment of the
`Internet linking users in Europe with users in the United
`States. Consider a Source user (see label called Source in the
`left of FIG. 1) in Paris. If this user wishes to send say an
`email message to San Francisco, the user will Send its
`message to a router R1 which is, say, in Paris. The Paris
`router may send this message on the communication link L4
`to router R, say in London. The London Router R may then
`Send the message on link L2 to router R3 in San Francisco;
`R3 then sends the message to the destination.
`Notice how a message travels from Source to destination
`alternating between communication links and routers. This
`is almost identical to the way a postal letter travels from post
`office to post office using Some communication channel
`(e.g., an airplane). How does each post office decide where
`to forward the letter? Each post office does So, using the
`destination address that is placed on the envelope containing
`the letter. In the same way, routers must decide to forward
`a message based on a destination address that is placed in an
`easily accessible portion of the message called a header.
`With this context, we can understand how a router for
`wards an incoming message. Consider the router R in FIG.
`
`50
`
`55
`
`60
`
`65
`
`EX.1096.012
`
`DELL
`
`
`
`3
`a phone book, or a word in a dictionary. In those problems,
`we can Search quite fast by first
`Sorting all the words or
`names. Once Sorted, if we are looking for a word Starting
`with Sea, we can simply to the pages of S entries and then
`look for words Starting with Sea etc. Clearly, Such lookup is
`a lot faster than looking up all entries in a dictionary. In fact,
`Such lookup is called exact matching lookup; Standard
`Solutions based on hashing and binary Search provide very
`fast times for exact matching.
`The Internet lookup problem is more difficult than dic
`tionary Search because Internet routerS Store address prefixes
`in their forwarding tables to reduce the size of their tables.
`However, the use of Such address prefixes makes the lookup
`problem one of the longest matching prefix instead of exact
`matching. The longest matching prefix problem is a lot
`harder than the exact matching problem. Before we explain
`why, let uS digreSS briefly and explain why routerS Store
`prefixes in their tables.
`Consider FIG. 3. The situation is similar to that in FIG. 1.
`However, we show the geographic significance of the
`address more clearly. Router R has link L1 to get to Boston
`as before, but Boston is also the “hub” for the whole of the
`U.S. ASSume that we can get to any destination in the U.S.
`from a hub router in Boston. AS before link L3 leads to
`California, from where a message can be sent directly to any
`location in California. Finally, as before, link L2 leads
`directly to San Francisco.
`If we were to use the naive database in FIG. 2, we would
`have to list every destination in the U.S. (possibly
`thousands) in the database. For example, we might list
`Denver, Kans., and other cities as being reachable through
`Boston on link L1. This would lead to an enormously large
`table in router R, which would be difficult to store and
`maintain.
`Instead, we prefer to Store prefixes in the modified data
`base of FIG.1. Notice that we now store all the destinations
`Such as Denver, Kans. by the Single entry USA. (anything
`in the USA). We store California as USA.CA.* (anything in
`California), and San Francisco as USA.CASF. Thus we
`have used only three entries to Store the same amount of
`information. Of course, to make this work we have to
`modify the destination address in a message from Say San
`Francisco (see FIG. 2) to say USA.CASF. But this is easily
`done.
`The use of prefixes introduces a new dimension to the
`lookup problem: multiple prefixes may match a given
`address. If a packet matches multiple prefixes, it is intuitive
`that the packet should be forwarded corresponding to the
`most specific prefix or longest prefix match. Thus a packet
`addressed to USA.CASF matches the USA*., USA.CA.*,
`and the USA.C.A.SF entries. Intuitively, it should be sent to
`L2 corresponding to the most specific match USA.C.A.SF.
`This is because (see FIG. 3) we have a direct line to San
`Francisco and want to use it in the place of possibly longer
`routing through Boston. Similarly a packet addressed to
`USA.CALA matches the USA and USACA entries.
`Intuitively, it should be sent to L3 corresponding to the most
`specific match USA.C.A.*.
`In Summary, routers obtain massive Savings in table size
`by Summarizing Several address entries by using a single
`prefix entry. Unfortunately, this leads to possibly multiple
`prefixes matching a given address, with the result that
`routerS must Solve a harder problem called best matching
`prefix.
`With this interlude behind us, we can define the Internet
`address lookup problem. First, Internet addresses are Strings
`
`15
`
`25
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`6,011,795
`
`4
`of bits, not words using English characters, as we used above
`for the sake of illustration. Abit is either a 0 or 1. Abit string
`is a sequence of bits like 0101. The length of a bit string is
`the number of bits it contains. Thus the length of bit string
`0101 is 4. Internet addresses come in two flavors. The
`current Internet (sometimes called IPv4, for Internet
`
`Protocol, version 4) uses addresses that
`are bit strings of
`length 32. We often say that IPv4 uses 32 bit addresses. The
`Internet is expected to evolve to a next generation Internet
`(sometimes called IPv6, for Internet
`Protocol, version 6)
`which uses 128 bit addresses. AS we will See, the longer
`length of IPv6 addresses will only compound the problems
`of routers.
`Except for this minor difference (bit strings instead of
`character Strings), the Internet lookup problem is exactly the
`best matching prefix problem described above. To make
`things more concrete, consider the forwarding table of
`Internet address prefixes shown in FIG. 6. Except for the fact
`that we use bit strings (and we have labeled the prefixes for
`convenience), the situation is identical to the table in FIG. 4.
`Now Suppose we have a 32 bit IPv4 destination address
`whose first 6 bits are 10101. Clearly its best matching prefix
`is Prefix P4 though it also matches Prefix P3 and P2. Thus
`any message to Such a destination address should be sent to
`the output link corresponding to P4, which is L2. The naivest
`method to Solve the best matching prefix problem is to Scan
`the entire forwarding table looking for the best matching
`prefix of an address. This would be grossly inefficient for
`large tables. We now describe two standard prior art solu
`tions that attempt to solve the IP matching prefix. The first
`Solution is based on converting the best matching prefix
`problem into an exact match problem. The Second Solution
`is based on using a data structure called a trie. We will see
`that both Solutions examine a destination address one bit at
`a time, and hence can take up to 32 steps for IPv4 (and 128
`for IPv6). This can be too slow. Later, we will describe our
`invention for processing IP addresses in larger Steps (which
`we call strides).
`From now, we will describe all schemes
`with respect to IPv4 (32bit) addresses unless we specifically
`generalize to include IPv6.
`In this standard solution from the prior art, we divide the
`forwarding table into Several (at most 32) separate forward
`ing tables Such that Table i contains all prefixes of length i.
`Thus, if we redraw the forwarding table of FIG. 5 using this
`idea, we get FIG. 6. Notice that prefix 1* is in the Length 1
`table, Prefixes 10* and 01* are in the Length 2 table, and so
`on. We have simply Separated prefixes into Separate tables
`according to prefix length.
`to find the longest prefix
`The idea now is to Start trying
`possible Starting with the longest length prefix table and
`work backwards until we find a prefix table that we get a
`match on. So consider an address A whose first 5 bits are
`10100. Since our longest prefix length is 5, we first try for
`a 5 bit match. We take the first 5 bits of address A (i.e.,
`10100) and use any technique for exact matching to match
`these first 5 bits of address A against any prefix in the Length
`5 database. A good technique to use for this is hashing. Since
`we fail to find a match, we move to the next length table
`(Length 3). This time we take the first 3 bits of address A
`(i.e., 101) and we search the Length 3 Table (see FIG. 6).
`This time we get a match and we see that the best matching
`prefix is 101* and the output link is L2.
`This method can cost up to 32 exact matches (often done
`using hashing in software) for IPv4 and 128 exact matches
`for IPv6. (To see this consider an address that matches a 1
`bit prefix, in a table that contains prefixes of all possible
`
`EX.1096.013
`
`DELL
`
`
`
`S
`lengths). An example of a patent that does this is U.S. Pat.
`No. 549,517 by Mullan et al. This is often too time con
`Suming in Software. Another patent proposes doing all the
`exact matches in parallel using hardware. Each exact match
`is done using a Context Addressable Memory (CAM).
`Unfortunately, the hardware cost of this solution is also
`formidable as we have to use 32 CAMs for IPv4 (128 for
`v6); each CAM is expensive. Other methods have proposed
`pipelining the CAMS instead of doing the Searches in
`parallel. However, this prior art Solution has not been met
`with great commercial acceptance.
`Another prior art Solution is based on the use of tries. A
`trie is a data structure which allows us to Search IP addresses
`a bit at a time, as in FIG. 6, but to do so incrementally. A trie
`is a tree of nodes, each note containing a table of pointers.
`The standard solutions for IPv4 (e.g., the solution used in the
`Berkeley UNIX software) uses binary tries, in which each
`trie node is a table consisting of two pointers.
`An example will explain how tries work. Consider FIG. 7.
`The root node is shown on the top left. Each trie node is a
`table whose topmost entry can contain a prefix. Each table
`also can contain two pointers, each of which points to other
`trie nodes (FIG. 7) or to prefixes. This trie stores
`the same
`table as in FIG. 5. The root node (top left) has two pointers.
`The first pointer, corresponding to the value 0, points to a
`subtrie that contains
`
`
`all prefixes that start
`with 0'. Since
`there is only one Such prefix, i.e., P1, the 0 pointer points
`directly to P1. On the other hand, all five other prefixes begin
`with 1. Thus the 1 pointer in the root node, points to a
`Subtrie that contains the 5 prefixes.
`Each Subtrie is a smaller trie with a smaller number of
`prefixeS. In addition to pointers, each node may also have a
`stored prefix P. Define the path of a trie node N to be the
`Sequence of bits corresponding to the pointers used to reach
`N starting from the root. Thus in FIG. 7, the path of Node
`1 is 1; the path of Node 2 is 10, and the path of Node 3 is
`101. We store a prefix P inside node N if the path of node N
`is equal to prefix P, ignoring the * character. Thus in FIG. 76,
`we see that Node 1 stores P2=1*, Node 2 stores P3=10*, and
`Node 3 stores P4=101*.
`Next, for each node N, the 0 pointer points to a subtrie
`of prefixes that begin with the Path of Node N followed by
`a '0'. The 1 pointer points to all prefixes that begin with the
`Path of Node N followed by a '1'. Thus, Node 1 has a '0'
`pointer that points to a Subtrie that contains all prefixes that
`start with the Path to Node 1 (i.e., 1), in other words all
`prefixes that start with 10. Thus Node 1's 0 pointer points to
`a subtrie that contains P5 (100*), P3 (10*), P4 (101*), and
`P6 (10111*). Node 1's 1 pointer does not point anywhere
`because there is no prefix that Starts with 11.
`Similarly, Node 2's Path is 10. Thus Node 2 has P3 stored
`internally (P3=10*). Also, the 0 pointer of P3 points to all
`prefixes that begin with 100. Since the only such prefix is
`P5=100*, the 0 pointer of Node 2 points to P5. Similarly,
`the 1 pointer in Node 2 points to all prefixes that start with
`101. Finally, Node 3's Path is 101. Thus P4=101* is stored
`at Node 3, and Node 3’s 1 pointer points to all prefixes that
`start with 1011, of which the only such prefix is P6.
`Now consider searching the trie table for the best match
`ing prefix corresponding to an address A whose first 6 bits
`are 101011. We use the bits of an address, starting with the
`leftmost bit, to follow a path through the trie. We always
`begin at the root node. Since the first bit of A is 1, we follow
`the 1 pointer to Node 1. Since Node 1 contains a prefix, P1,
`we remember this as a possible matching prefix. Then, Since
`the second bit of A is 0, we follow the 0 pointer to Node
`
`35
`
`40
`
`45
`
`50
`
`55
`
`60
`
`65
`
`6,011,795
`
`15
`
`25
`
`6
`2. Once again, Since node 2 contains a prefix, P3, we
`remember this as the longest prefix we have seen So far on
`the path.
`Next, since the third bit of A is 1, we follow the 1 pointer
`at Node 2 to Node 3. Once again, since Node 3 contains a
`prefix, P4, we remember P4 as the longest matching prefix
`seen so far. Finally, since the fourth bit of A is a '0', we try
`to follow the 0 pointer at Node 3. At this point our trip
`through the trie stops because there is no 0 pointer at Node
`3. When this happens we conclude that the best matching
`prefix for address A is P4. In actual practice, in every place
`a prefix is stored in FIG. 7, we would also store the
`corresponding output link as in FIG. 6 and FIG. 5. We have
`just omitted output links for the Sake of Simplicity.
`Thus, to find a best match prefix in a trie, we use
`Successive bits of the address to trade a path through the trie,
`Starting from the root, until we fail to find a pointer or we
`end at a prefix. AS we walk through the trie, we remember
`the last prefix we saw at a previous node, if any. When we
`fail to find a pointer, this is the best matching prefix.
`AS a Second example of a Search that ends with a prefix,
`consider Searching for an address B whose first three bits are
`100. We go from the root to Node 1 to Node 2 and then
`follow the 0 pointer to prefix P5 which is the best matching
`prefix.
`The worst case time to walk through a trie path is the
`maximum number of nodes in a trie path. If we have the
`sequence of prefixes 1*, 11, 111*, 111*, etc. then we can
`easily have a trie path equal to the maximum address length
`(32 for IPv4, 128 for IPv6). Thus the time for trie search of
`an address can be as bad as following 32 (or 128 for V6)
`pointers. This is somewhat better than the 32 exact matches
`required in FIG. 6, but it is still slow for real routers. The
`problem is that the following of each pointer requires at least
`one READ of memory. The fastest reads to reasonably
`inexpensive memory take about 0.06 usec. Thus 32 READS
`take about 1.7 usec, which is the fastest that routers can do
`today.
`A description of Tries can be found in the textbook called
`“Fundamental Algorithms, Sorting and Searching, by
`Donald Knuth, Addison Wesley, 1973. A description of a
`particular kind of trie (called a Patricia trie, and which is
`optimized to reduce storage) applied to Internet lookups can
`be found in Keith Sklower. A tree-based routing table for
`berkeley unix. Technical report, University of California,
`Berkeley and in W. Richard Stevens and Gary R. Wright.
`TCP/IP Illustrated, Volume 2 The Implementation. Addison
`Wesley, 1995. H. Wilkinson, G. Varghese and N. Poole,
`Compressed Prefix Matching Database Searching, U.S.
`patent application Ser. No. 07/378,718, December 89. Issued
`in Australia as Patent 620994 describes another variant of
`tries that reduces Storage using a technique called path
`compression. All the existing trie Schemes assume that trie
`Search must be performed 1 bit at a time if the prefixes can
`be of arbitrary length. This greatly Slows down trie Search as
`it requires W memory READS, where W is the size of a
`destination address.
`Trie Search that Searches multiple bits at a time is
`
`described in
`Tong-Bi Pei and Charles
`Zukowski. Putting
`routing tables in silicon. IEEE Network Magazine, January
`1992. However, this work only applies to exact matching
`and not to prefix matching. The work in
`U.S. Pat. No.
`5,414,704 by Spinney applies only
`to exact matching and
`not to prefix matching. Radia Perlman. Interconnections,
`Bridges and Routers. Addison-Wesley, 1992 describes a
`method based on binary Search. Unfortunately binary Search
`
`EX.1096.014
`
`DELL
`
`
`
`7
`to the logarithm of the number of
`takes time proportional
`entries. For a typical router database of 30,000 entries this
`takes 13 READS to memory which is too slow. The work in
`U.S. Pat. No. 5,261,090 applies to range checking which is
`a similar problem but also uses binary Search.
`SUMMARY OF THE INVENTION
`Our new invention addresses the slow lookup times of all
`previous Schemes by reducing the number of distinct prefix
`lengths to be considered. Before we describe out invention,
`we describe why the IP address problem can be considerably
`Simpler, if prefixes do not
`have arbitrary length but have
`lengths that are multiples of Say X, where XZ 1. If this
`assumption is true, we will show that we can Search through
`IP address X bits at a time, which is faster. However, real IP
`prefixes have arbitrary lengths. The essence of our new idea
`is to reduce the problem of searching for arbitrary length IP
`addresses to the simpler problem of search IP addresses that
`are a multiple of X, for XZ 1.
`Suppose we had a forwarding table Such that all prefix
`lengths were multiples of 4. In other words, we only had
`prefix lengths that were either 0, 4, 8, 12, 16, 20, 24, 28, or
`32 for IPv4. Then, if we return to the standard Scheme in
`Section 3.1 it is easy to see that since we only have 8 distinct
`prefix lengths, that we only have to do at most 8 exact
`matches. In general, if the prefix lengths are multiples of X,
`and L is the length of an address, we only need to do L/X
`exact matches. This is a big savings: if x=4, we can go 4
`times faster. In general, if we had only
`n distinct prefix
`lengths, L1, L2,..., L. We only need in exact matches.
`Similarly, in the case of trie Search, if all prefix lengths are
`multiples of Say 4, then we can Stop doing trie Search 1 bit
`at a time, and do trie Search 4 bits at a time All the trie nodes
`will now have 16 pointers, corresponding to all possible
`values of 4 bits. Except for this change, the trie Search
`algorithm remains unchanged. Notice that this works only
`because prefix lengths are multiples of 4, and So we will not
`“miss’ any prefixes by walking the trie in
`larger Strides of 4
`units. In general, if we walk a trie X bits at a time, it is easy
`to see that the worst case time is cut down to L/X, as opposed
`to L. (Recall that L is the length of the address).
`The fact that restricted prefix lengths leads to a faster
`search is well known. For example, OSI (another world
`Standard routing protocol that is still popular in parts of the
`world) uses prefixes just as in IP, but the prefix lengths are
`always multiples of 4. It was well known that trie search of
`OSI addresses could proceed in strides of length 4, but only
`in those instances where the prefix was controlled to be a
`certain length, and we are not aware of any prior art Scheme
`for adapting arbitrary length prefixes to prefixes whose
`lengths belong to a Set of preselected lengths.
`We now come to the present invention. We just saw that
`if we could restrict IP prefix lengths we could get faster
`search. But it is well known that real IP prefix lengths are
`nearly arbitrary. Typical backbone routers contain all prefix
`lengths from 2 to 32! Thus prefix lengths are almost arbi
`trary. A central featu