In its 2011 5th Annual Visual Networking Index Forecast, Cisco Systems projected that global Internet traffic will reach approximately 1 zettabyte per year by 2015 (966×10^18 bytes). This equates to approximately 80 exabytes per month in 2015, or approximately 245 terabits per second, up from approximately 20 exabytes per month in 2010. This global Internet traffic will arise from approximately 3 billion people using approximately 6 billion portable and fixed electronic devices. At the same time, average broadband access speeds will have increased to nearly 30 megabits per second from approximately 7 megabits per second in 2011. This Internet traffic and consumer-driven broadband access is supported through a variety of local, metropolitan, and wide area networks together with long haul, trunk, submarine, and backbone networks operating at OC-48 (2.5 Gb/s), OC-192 (10 Gb/s), and OC-768 (40 Gb/s), with coarse and dense wavelength division multiplexing providing overall channel capacities in some instances in excess of 1 Tb/s.
Dispersed between and within these networks are Internet routers, which have become key to the Internet backbone. Such routers include, but are not limited to, edge routers, subscriber edge routers, inter-provider border routers, and core routers. Accordingly, the overall data handled by these Internet routers by 2015 will be many times the approximately 1 zettabyte actually delivered to users, who will expect not only high access speeds but also low latency. Internet routers therefore require fast IP-lookup operations over routing tables of hundreds of thousands of entries or more. Each router forwards received packets toward their final destinations using a Longest Prefix Matching (LPM) algorithm, which selects from a routing table the entry that determines the closest location to the final packet destination among several candidates. As an entry in a routing table may specify a network, one destination address may match more than one routing table entry. The most specific matching entry, the one with the longest subnet mask, is called the longest prefix match. With the length of a packet being up to 32 bits for Internet Protocol version 4 (IPv4) and 144 bits for Internet Protocol version 6 (IPv6), it is evident that even at OC-48 (2.5 Gb/s), with maximum-length IPv6 packets, over 17 million packets are received per second (2.5×10^9 / 144 ≈ 17.4×10^6). These packets contain binary strings and wildcards.
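By way of illustration only, the longest prefix match selection described above may be sketched in software as a simple linear scan. This is a minimal sketch and not the hardware lookup engine of any embodiment; the routing table contents, the next-hop labels, and the function name `longest_prefix_match` are hypothetical and chosen purely for the example.

```python
import ipaddress

# Hypothetical routing table of (prefix, next hop) pairs; illustrative only.
ROUTES = [
    ("0.0.0.0/0", "default-gw"),
    ("10.0.0.0/8", "hop-A"),
    ("10.1.0.0/16", "hop-B"),
    ("10.1.2.0/24", "hop-C"),
]

def longest_prefix_match(dest, routes=ROUTES):
    """Return (prefix, next_hop) of the most specific matching entry, or None."""
    addr = ipaddress.ip_address(dest)
    best = None  # (network, next_hop) of the longest match seen so far
    for prefix, hop in routes:
        net = ipaddress.ip_network(prefix)
        # Several entries may contain the address; keep the longest prefix.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, hop)
    return None if best is None else (str(best[0]), best[1])

print(longest_prefix_match("10.1.2.3"))
print(longest_prefix_match("10.1.9.9"))
```

In this sketch the address 10.1.2.3 matches all four entries and the /24 entry is selected, whereas 10.1.9.9 matches only the default, /8, and /16 entries and the /16 entry is selected. A scan of this kind examines every entry per lookup, which is why dedicated hardware approaches are used at the packet rates noted above.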
The hardware of the LPM has been designed within the prior art using several approaches including, but not limited to:
    Ternary Content-Addressable Memory (TCAM), see for example Gamache et al. in “A fast ternary CAM design for IP networking applications” (Proc. 12th IEEE ICCCN, pp. 434-439), Noda et al. in “A cost-efficient high-performance dynamic TCAM with pipelined hierarchical searching and shift redundancy architecture” (IEEE JSSC, Vol. 40, No. 1, pp. 245-253), Maurya et al. in “A dynamic longest prefix matching content addressable memory for IP routing” (IEEE TVLSI, Vol. 19, No. 6, pp. 963-972), and Kuroda et al. in “A 200 Msps, 0.6 W eDRAM-based search engine applying full-route capacity dedicated FIB application” (Proc. CICC 2012, pp. 1-4);
    Trie-based schemes, see for example Eatherton et al. in “Tree bitmap: hardware/software IP lookups with incremental updates” (SIGCOMM Comput. Commun. Rev., Vol. 34, No. 2, pp. 97-122) and Bando et al. in “Flashtrie: Beyond 100-Gb/s IP route lookup using hash-based prefix-compressed trie” (IEEE/ACM Trans. Networking, Vol. 20, No. 4, pp. 1262-1275); and
    Hash-based schemes, see for example Hasan et al. in “Chisel: A storage-efficient, collision-free hash-based network processing architecture” (Proc. 33rd ISCA, pp. 203-215, June 2006) and Dharmapurikar et al. in “Longest prefix matching using bloom filters” (IEEE/ACM Trans. Networking 2006, Vol. 14, No. 2, pp. 397-409).
Unlike random access memory (RAM), which returns the data word stored at a supplied memory address, a Content Addressable Memory (CAM) searches its entire memory to see if a data word supplied to it is stored anywhere within it. If the data word is found, the CAM returns a list of the one or more storage addresses where the word was found. A Ternary Content Addressable Memory (TCAM) allows a third matching state of “X” (or “Don't Care”) in addition to “0” and “1” for one or more of the bits within the stored data word, thus adding flexibility to the search. Beneficially, TCAMs search all entries stored in the TCAM cells in parallel and therefore allow high-speed lookup operations. However, the large area of the TCAM cell, which uses 16 transistors versus the 6 transistors of a static RAM (SRAM) cell, together with the brute-force searching methodology, results in large power dissipation and inefficient hardware architectures for large forwarding tables. In contrast, trie-based schemes store prefixes and locations in an ordered binary-tree data structure that is built from portions of the stored Internet Protocol (IP) addresses. Searching is performed by traversing the tree until an LPM is found and may be implemented in hardware using SRAMs rather than TCAMs, which potentially lowers power dissipation. However, deep trees require multi-step lookups, slowing the determination of the LPM. Hash-based schemes use one or more hash tables to store prefixes, the benefit being scalability as table size is increased, with length-independent searching speed. However, hash-based schemes have a possibility of collisions, which requires post-processing to decide on only one output, and require reading many hash tables for each length of stored strings, thereby slowing the process.
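The ternary matching and trie traversal described above may be illustrated, purely as a software sketch, as follows. The names `ternary_match`, `tcam_search`, and `BinaryTrie`, and the 4-bit example words, are hypothetical and do not correspond to any cited design; in particular the loop in `tcam_search` stands in for comparisons that a hardware TCAM performs in parallel, and the trie shown consumes one bit per level, which is the simplest possible arrangement.

```python
def ternary_match(stored, key):
    """True if every stored bit equals the key bit or is the don't-care 'X'."""
    return len(stored) == len(key) and all(
        s in ("X", k) for s, k in zip(stored, key)
    )

def tcam_search(entries, key):
    """Compare the key against every stored word (sequentially here, in
    parallel in hardware) and return the addresses of all matching words."""
    return [addr for addr, word in enumerate(entries) if ternary_match(word, key)]

class BinaryTrie:
    """One-bit-per-level trie; each node is a list [child0, child1, value]."""

    def __init__(self):
        self.root = [None, None, None]

    def insert(self, prefix_bits, value):
        node = self.root
        for b in prefix_bits:
            i = int(b)
            if node[i] is None:
                node[i] = [None, None, None]
            node = node[i]
        node[2] = value

    def lookup(self, key_bits):
        """Walk down the trie, remembering the deepest value seen (the LPM)."""
        node, best = self.root, None
        for b in key_bits:
            if node[2] is not None:
                best = node[2]
            node = node[int(b)]
            if node is None:
                return best  # no deeper prefix exists
        return node[2] if node[2] is not None else best

# Hypothetical 4-bit words; 'X' marks a don't-care bit.
entries = ["10XX", "101X", "0XXX"]
print(tcam_search(entries, "1011"))

trie = BinaryTrie()
trie.insert("10", "A")
trie.insert("101", "B")
print(trie.lookup("1011"), trie.lookup("1000"))
```

In this sketch the key 1011 matches the stored words at addresses 0 and 1, so a priority step is still needed to pick the longer prefix, while the trie lookup returns the longest stored prefix directly but takes one step per bit examined, reflecting the multi-step lookup cost noted above.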
Accordingly, it would be evident that prior art solutions to LPM lookup offer different tradeoffs and that a design methodology providing low-power, large-scale IP lookup engines that address the limitations within the prior art would be beneficial. With carriers looking to add picocells, for example with ranges of a few hundred meters, to augment microcells and base stations in order to address capacity demands in dense urban environments, power consumption becomes an important factor weighing against conventional IP router deployment scenarios. According to embodiments of the invention, a low-power large-scale IP lookup engine may be implemented exploiting clustered neural networks (CNNs). In addition to reduced power consumption, embodiments of the invention provide a reduced transistor count, allowing reduced semiconductor die footprints and hence reduced die cost.
Beneficially, low-cost TCAMs would allow for their deployment within a variety of other applications where to date they have not been feasible due to cost, as well as others where their deployment had not previously been considered. For example, TCAMs would enable routers to perform additional functions beyond address lookups, including, but not limited to, virus detection and intrusion detection.
Other aspects and features of the present invention will become apparent to those ordinarily skilled in the art upon review of the following description of specific embodiments of the invention in conjunction with the accompanying figures.