This invention relates to prefix matching in database searches in general, and more particularly to high-speed routing lookups of IP messages traveling across the Internet.
Traffic on the Internet is increasing exponentially. Traffic increase can be traced not only to the increased number of hosts, but also to new applications (e.g., the Web, video conferencing, remote imaging), which have higher bandwidth needs than traditional applications.
One can only expect further increases in users, computers, applications and traffic. The possibility of a global Internet with multiple addresses per user (e.g., each user may have several appliances on the Internet) has necessitated a transition from the older Internet routing protocol (called IPv4, with small 32 bit addresses) to the proposed next generation protocol (called IPv6, with much larger 128 bit addresses).
The increasing traffic demand placed on the network forces two key factors to keep pace: first, the speed of communication links; and second, the rate at which routers can forward messages.
Routers are computers that route packets in the Internet according to the destination address of the packet that is placed in a portion of the packet called a header, very much like automated Post Offices in the postal network.
With the advent of fiber optic links, it is easy and economical to increase the speed of communication links. However, improving the speed of communication links is insufficient unless routers' forwarding speeds increase proportionately; thus, router vendors wish to increase the forwarding performance of their routers.
The two main functions of a router are to look up the destination address of the packet (address lookup) and then to send the packet to the right output link (message switching). Fortunately, the problem of message switching has become well understood in recent years because of advances in asynchronous transfer mode (ATM) switching technology, and economical gigabit message switching is quite feasible today. Thus, the major problem that routers face in forwarding an Internet message is address lookup.
When a message carrying its destination address arrives on a certain link of a router, the router consults a Forwarding Table (sometimes also called a Forwarding Database). This is a table in the memory of the router, which lists possible destination addresses and their corresponding output links.
However, it is impossible for each router to store routing information for every possible address on the network. Rather, routers store routing information for partial addresses. Consequently, the packet advances toward its destination by “hopping” from router to router until it reaches its final destination.
An Internet address consists of a string of bits. IPv4 uses 32 bit addresses, while the expected IPv6 will use 128 bit addresses. A prefix is a string of bits, stored in an entry of the Forwarding Table, that is identical to an initial sequence of the bits of the address, beginning with the first bit of the address.
Routers obtain massive savings in table size by summarizing several address entries by using a single prefix entry. Unfortunately, the use of prefixes introduces a new dimension to the lookup problem: multiple prefixes with various lengths may match a given address.
If a packet matches multiple prefixes, it is intuitively clear that the packet should be forwarded according to the most specific (longest) prefix that it matches.
When a message arrives, the router must search through its database, retrieve the entry corresponding to the best (longest) matching prefix (BMP) of the destination address, and forward the message accordingly toward its final destination via a router which is “closer” to the destination address.
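To make the best-matching-prefix rule concrete, the following minimal Python sketch scans a small forwarding table linearly and keeps the longest prefix that matches. The table contents, the link names, and the function name `longest_prefix_match` are hypothetical and chosen only for illustration; addresses are shortened to 8 bits and written as bit strings.

```python
# Hypothetical forwarding table: bit-string prefix -> output link name.
FORWARDING_TABLE = {
    "": "link0",      # empty prefix: default route, matches every address
    "10": "link1",
    "1011": "link2",
    "0110": "link3",
}


def longest_prefix_match(table, address_bits):
    """Return (prefix, link) for the longest matching prefix, or None."""
    best = None
    for prefix, link in table.items():
        # A prefix matches when the address begins with exactly those bits.
        if address_bits.startswith(prefix):
            if best is None or len(prefix) > len(best[0]):
                best = (prefix, link)
    return best
```

For the address `10110000` both `10` and `1011` match, and the longer prefix `1011` determines the output link. A linear scan of this kind examines every entry and is shown only to define the problem, not as a practical lookup method.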
Current speeds for lookups are quite slow. For example, in existing routers that use hardware assistance for lookups, a single lookup can take up to 1 μs on average and 3 μs in the worst case.
The greater length of IPv6 addresses, as compared to IPv4 addresses, will only compound the address lookup problems of routers.
Speeding up address lookups has therefore received considerable attention in recent years. Three major directions were taken:
(1) Better implementations of the data structures and search techniques in the router, mostly software based; (2) hardware approaches that enable fast lookups through parallelism in the hardware; and (3) avoiding the lookup process by adding indexing keys, such as labels and flow identifiers, to the packet headers. Being prior art, these approaches will be briefly described:
Data Structures and Algorithms Approach:
This approach is treated in the following references:
    K. Sklower. “A tree-based routing table for Berkeley UNIX”. Technical report, 1992.
    R. Perlman. “Interconnections, Bridges and Routers”. Addison-Wesley, 1992.
    M. Waldvogel, G. Varghese, J. Turner, and B. Plattner. “Scalable high speed IP routing lookups”. In Proc. ACM SIGCOMM 97, October 1997.
    M. Degermark, A. Brodnik, S. Carlsson, and S. Pink. “Small forwarding table for fast routing lookups”. In Proc. ACM SIGCOMM 97, October 1997.
    B. Lampson, V. Srinivasan, and G. Varghese. “IP lookups using multi way and multi-column search”. In Proc. Infocom 98, March 1998.
The standard IP lookup algorithm currently in use is based on a radix trie (or Patricia trie), see Samuel J. Leffler et al. “The Design and Implementation of the 4.3BSD UNIX” Addison-Wesley, 1988, and K. Sklower, 1992.
In this implementation the prefixes are efficiently represented in a trie. A trie is a data structure that allows prefixes to be searched incrementally, one bit at a time.
A trie consists of a tree of nodes, each node containing a table of pointers. The standard solution for IPv4 (e.g., the solution used in BSD UNIX) uses binary tries, in which each trie node is a table consisting of two pointers.
Each address lookup is performed by scanning the address bit by bit and matching it along a path in the trie. The worst case cost of an IP lookup is thus O(W), where W is the address length (32 in IPv4, 128 in IPv6). This scheme requires O(N) space, where N is the total number of prefixes in the forwarding table.
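The bit-by-bit trie walk can be sketched as follows. This is a plain, uncompressed binary trie written for illustration, not the path-compressed Patricia structure of the cited BSD implementation; the names `TrieNode`, `insert`, and `lookup` are hypothetical, and addresses are given as bit strings.

```python
class TrieNode:
    """One trie node: a table of two pointers plus an optional route."""

    def __init__(self):
        self.children = [None, None]  # child for next bit 0, child for next bit 1
        self.link = None              # output link if a prefix ends at this node


def insert(root, prefix, link):
    """Add a prefix (bit string) with its output link to the trie."""
    node = root
    for bit in prefix:
        index = int(bit)
        if node.children[index] is None:
            node.children[index] = TrieNode()
        node = node.children[index]
    node.link = link


def lookup(root, address_bits):
    """Walk the trie one address bit at a time; return the last link seen."""
    node = root
    best = root.link  # link of the empty prefix (default route), if any
    for bit in address_bits:
        node = node.children[int(bit)]
        if node is None:
            break  # no longer prefix can match
        if node.link is not None:
            best = node.link  # a longer matching prefix was found
    return best
```

The loop in `lookup` runs at most once per address bit, which is the O(W) worst case stated above.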
The basic approaches to improving this scheme are: (1) Perform a binary search over the possible prefix lengths, see M. Waldvogel et al., 1997. For each test in the binary search a hash table is consulted, which requires that the prefixes be divided among several hash tables that together require O(N log W) space.
(2) Go over the address in larger jumps, rather than bit by bit, see V. Srinivasan and G. Varghese. “Faster IP lookups using controlled prefix expansion”. In Proc. ACM Sigmetrics 98, June 1998.
(3) Perform a binary search over the space of N prefixes, requiring O(log 2N) steps, see R. Perlman, 1992. This approach has been improved by relying on synchronous dynamic random access memory (SDRAM) technology and performing a 6-way search resulting in O(log N) steps, see B. Lampson et al., 1998.
(4) Compress the prefix data structure so that it fits into the cache, see Degermark et al., 1997, and S. Nilsson and G. Karlsson. “Fast address look-up for Internet routers”. In Proc. IEEE Broadband Communications 98, April 1998.
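Approach (1), binary search over prefix lengths, can be illustrated by the simplified sketch below. It assumes one hash table (a Python dict) per prefix length and adds "marker" entries at the shorter lengths that the binary search probes on its way to each prefix; every entry stores the best matching prefix of its own bit string, computed naively at build time. The function names and the dict-based tables are assumptions made for this illustration and are not taken from the cited paper's implementation.

```python
def best_match(prefixes, bits):
    """Build-time helper: link of the longest prefix that `bits` starts with."""
    best_len, best_link = -1, None
    for prefix, link in prefixes.items():
        if bits.startswith(prefix) and len(prefix) > best_len:
            best_len, best_link = len(prefix), link
    return best_link


def build_tables(prefixes, width):
    """Create one hash table per length, holding prefixes and markers."""
    tables = {length: {} for length in range(1, width + 1)}
    for prefix in prefixes:
        lo, hi = 1, width
        while lo <= hi:
            mid = (lo + hi) // 2
            if mid > len(prefix):
                hi = mid - 1
                continue
            # mid < len(prefix): marker entry; mid == len(prefix): the prefix.
            key = prefix[:mid]
            tables[mid][key] = best_match(prefixes, key)
            if mid == len(prefix):
                break
            lo = mid + 1
    return tables


def search(tables, width, address_bits, default=None):
    """Binary search over lengths: O(log W) hash probes per lookup."""
    lo, hi = 1, width
    best = default
    while lo <= hi:
        mid = (lo + hi) // 2
        key = address_bits[:mid]
        if key in tables[mid]:
            if tables[mid][key] is not None:
                best = tables[mid][key]
            lo = mid + 1   # a hit means a longer match may exist
        else:
            hi = mid - 1   # a miss means only shorter lengths can match
    return best
```

Each prefix contributes at most one entry per probe of its own binary search, which is where the O(N log W) space bound comes from. The `default` argument stands in for a zero-length default route, which this sketch does not store in the tables.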
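Approach (2), examining the address several bits at a time, relies on expanding each prefix to a length that is a multiple of the chosen jump (stride). The sketch below shows only this expansion step and a lookup over the expanded set; a dict stands in for the multibit trie nodes, and the names `expand` and `lookup_strided` are hypothetical. It is a simplified illustration with one fixed stride, not the optimized scheme of the cited paper.

```python
from itertools import product


def expand(prefixes, stride):
    """Expand every prefix to the next length that is a multiple of `stride`."""
    expanded = {}  # expanded bit string -> (original prefix length, link)
    for prefix, link in prefixes.items():
        pad = (-len(prefix)) % stride  # number of bits to append
        for tail in product("01", repeat=pad):
            key = prefix + "".join(tail)
            # When two expansions collide, the longer original prefix wins.
            if key not in expanded or expanded[key][0] < len(prefix):
                expanded[key] = (len(prefix), link)
    return {key: value[1] for key, value in expanded.items()}


def lookup_strided(expanded, address_bits, stride):
    """Probe the address `stride` bits at a time and keep the last hit."""
    best = expanded.get("")  # zero-length default route, if present
    for end in range(stride, len(address_bits) + 1, stride):
        link = expanded.get(address_bits[:end])
        if link is not None:
            best = link
    return best
```

With a stride of 2, the prefix `101` is replaced by `1010` and `1011`, so a lookup needs W/2 probes instead of W, at the cost of a larger table.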
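Approach (3), binary search over the prefixes themselves, can be sketched by treating each prefix as a range of addresses. In the illustration below every prefix contributes up to two boundary points (the start of its range and the first address after it), the best matching prefix is precomputed for each interval between boundaries, and a lookup is a single binary search over at most 2N + 1 points. The function names are hypothetical, and the precomputation is done naively for brevity.

```python
import bisect


def build_ranges(prefixes, width):
    """Return sorted interval starts and the link valid for each interval."""
    points = {0}
    for prefix in prefixes:
        low = int(prefix.ljust(width, "0"), 2)   # first address covered
        high = int(prefix.ljust(width, "1"), 2)  # last address covered
        points.add(low)
        if high + 1 < 2 ** width:
            points.add(high + 1)                 # first address after the range
    starts = sorted(points)

    def naive_link(address):
        bits = format(address, "0{}b".format(width))
        best_len, best_link = -1, None
        for prefix, link in prefixes.items():
            if bits.startswith(prefix) and len(prefix) > best_len:
                best_len, best_link = len(prefix), link
        return best_link

    # The best match is constant between two consecutive boundary points.
    links = [naive_link(start) for start in starts]
    return starts, links


def lookup_range(starts, links, address):
    """Binary search for the interval that contains `address` (an integer)."""
    index = bisect.bisect_right(starts, address) - 1
    return links[index]
```

For example, with 8-bit addresses the prefix `1011` covers the integer range 176 to 191, so it contributes the boundary points 176 and 192.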
Patent literature concerned with increasing the speed of IP routing lookups includes: U.S. Pat. No. 6,011,795 to Varghese, et al., U.S. Pat. No. 6,014,659 to Wilkinson III, et al. and U.S. Pat. No. 6,018,524 to Turner, et al.
Hardware Approach:
This approach is shown in the following references:
    A. J. McAuley and P. Francis. “Fast routing table lookup using cams”. In Proc. INFOCOM, pages 1382-1391, March-April 1993.
    A. J. McAuley, P. F. Tsuchiya, and D. V. Wilson. “Fast multilevel hierarchical routing table using content-addressable memory”. In Proc. SIGCOMM 95. January 1995, and in U.S. Pat. No. 5,3816,423 to McAuley.
There are several directions, all based on the use of parallelism at the hardware level:
    (1) Use of pipelining to perform several lookups at the same time, see P. Gupta, S. Lin, and N. McKeown. “Routing lookups in hardware at memory access speeds”. In Proc. INFOCOM, April 1998.
    (2) Use of low level hardware parallelism by means of content addressable memories (CAMs), see A. J. McAuley et al., 1993 & 1995. In such memories (like associative memories), the address is compared against all the prefixes in the memory in parallel.
    (3) Use of a cache to hold the results of recent lookups. A 90% hit rate can be achieved only by employing a large and very expensive cache based on CAM technology. See C. Partridge. “Locality and route caches.” In NSF Workshop on Internet Statistics Measurement and Analysis, February 1996, and P. Newman, G. Minshall, and L. Huston. “IP switching and gigabit routers.” In IEEE Communications Magazine, January 1997.
All the hardware solutions suffer from very high costs especially when applied to large backbone routers, and they don't scale easily.
Label Swapping Approach:
Motivated by the increased demand for gigabit routers to carry the ever-growing IP traffic, there have recently been several suggestions to combine fast packet switching/processing with IP routing. Specifically, it is suggested to exploit the lower price of bandwidth compared with the price of processing.
The idea is to add some information to the packet header, which helps the routers along the packet path to process the packet, i.e., perform IP lookups much faster.
This direction includes IP-switching, see Newman et al., 1997; TAG-switching, see Yakov Rekhter et al., “Tag switching architecture overview”. Technical report, IETF, 1996, ftp://ds.internic.net/internet-drafts/draft-rfced-info-rekhter-00.txt; threaded indices, see G. Chandranmenon and G. Varghese. “Trading packet headers for packet processing”. In IEEE Transactions on Networking, April 1996; and Multiprotocol Label Switching (MPLS), see R. Callon, P. Doolan, N. Feldman, A. Fredette, and G. Swallow. “A framework for multiprotocol label switching”. Technical report, IETF, November 1997. draft-ietf-mpls-framework-02.txt.
Basically, a label is attached to each packet in a flow. Routing decisions are made by one memory reference into a table of labels (similar to virtual circuit (VC) switching in ATM). Each entry in the table contains, for the corresponding label, its routing decision and perhaps a new label to attach to the packet.
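The single table reference described above can be pictured with the following minimal sketch. The label values, link names, packet layout, and the function name `forward` are all hypothetical; real label-switching protocols define their own header formats and table contents.

```python
# Hypothetical label table of one router:
# incoming label -> (output link, label to attach for the next hop).
LABEL_TABLE = {
    17: ("link2", 42),
    42: ("link0", 7),
}


def forward(label_table, packet):
    """Pick the output link with one table reference and swap the label."""
    out_link, new_label = label_table[packet["label"]]
    # Copy the packet so the caller's packet is left unchanged.
    swapped = dict(packet, label=new_label)
    return out_link, swapped
```

A packet arriving with label 17 leaves on `link2` carrying label 42; no prefix search over the destination address takes place at this router.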
The main issues in the label swapping methods are how to associate a label with a flow, when this association is made, and whether labels may be aggregated.
Two basic approaches are: traffic based label assignments, and topology based label assignments. In traffic based label assignment, each flow of packets receives a label, similar to VC routing in ATM.
This method introduces setup overhead that delays the first packet of a flow by either a complete round trip or by just one hop in a more sophisticated implementation. In the topology based approach, a label is assigned to each destination or group of destinations (another, much more expensive possibility is to assign a label to each source-destination pair, like a permanent virtual circuit (PVC) in ATM).
Neither of the label approaches completely eliminates the need for a full IP lookup. When packets are transferred between different networks (networks that are owned by different companies), an IP lookup is required to compute new labels in order to resolve label coordination problems.
Both switching methods require additional coordination and communication between routers to distribute and agree on the labels. These methods require a major change in the router protocol and work only in those portions of the network that have implemented them.
Since the number of labels is bounded, it is impossible to assign each destination or each flow its own label. Thus, in TAG switching for example, a label is given to a group of destinations, and when the packets approach the destination they need to be separated, which again requires a full IP lookup.
According to the prior art described above, it is understood that IP address lookup cannot be avoided. There is therefore a widely recognized need for an algorithm that quickly looks up an IP address and forwards the message to its destination, and that overcomes the disadvantages of the presently known lookup methods described above.