Vector, or data string, matching has many applications in algorithms, database management, data mining, and other operations requiring matching of a string of data. More specifically, a very common application of string matching is in a data network, which seeks matching addresses for forwarding operations in a network, such as a local area network (LAN) or the Internet. Data communication networks utilize addresses to forward packets of information between users. As data communication networks continue to evolve, the length of addresses supported, the quantity of traffic, and the data rate at which the traffic travels are all increasing. Consequently, routers in such communications networks must determine sources and destinations of endstations associated with traffic more quickly than in the past. For example, Internet Protocol version 4 (“IPv4”), which uses 32-bit addresses and is still in use today, has evolved to the more recent IPv6, which uses 128-bit addresses. Explained differently, IPv6 has 2^96, or about 7.9×10^28, times as many addresses as IPv4. With an increase in the quantity of addresses, a commensurate increase in the size of memory is needed to hold all those addresses. Furthermore, address lookups in the vastly larger memory block may take longer, which makes it harder to maintain throughput rates.
Routing data packets in Internet Protocol (IP) networks requires a determination of the best matching prefix corresponding to the source and destination addresses for the packet. This process is also referred to as determining a longest prefix match (LPM) for an address. Routers that forward packets typically include a database that stores a number of address prefixes and their associated forwarding decisions, each of which indicates where the data should be sent next (the next hop). When the router receives a packet, it must determine which of the prefixes in the database is the best match for the packet based on the longest prefix match, i.e., the longest string of digits, read from the left side of the address towards the right side, that matches a stored prefix. A longer match represents a more specific address location.
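The longest prefix match described above can be illustrated with a minimal sketch in Python. The forwarding table, its prefixes, and the next-hop labels are hypothetical values chosen only for illustration, and the linear scan shown here is the simplest possible lookup, not the method of any particular router.

```python
import ipaddress

# Hypothetical forwarding table: prefix -> next hop (illustrative values only).
FORWARDING_TABLE = {
    "0.0.0.0/0": "hop-default",
    "10.0.0.0/8": "hop-A",
    "10.1.0.0/16": "hop-B",
    "10.1.2.0/24": "hop-C",
}


def longest_prefix_match(address, table):
    """Return (prefix, next_hop) for the most specific matching prefix, or None."""
    addr = ipaddress.ip_address(address)
    best = None  # (network, next_hop) of the longest match seen so far
    for prefix, next_hop in table.items():
        network = ipaddress.ip_network(prefix)
        if addr.version != network.version:
            continue  # never compare an IPv4 address against an IPv6 prefix
        if addr in network:
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, next_hop)
    if best is None:
        return None
    return (str(best[0]), best[1])


if __name__ == "__main__":
    for candidate in ("10.1.2.3", "10.1.9.9", "192.168.1.1"):
        print(candidate, "->", longest_prefix_match(candidate, FORWARDING_TABLE))
```

In this sketch the address 10.1.2.3 matches all four prefixes, and the 24-bit prefix is selected because it is the longest. A scan of every entry does not scale to large tables, which is why trie-based and stride-based structures are used in practice.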
Parsing a long 32-bit or 128-bit address into multiple strides having multiple bits in each stride allows an address to be searched in chunks. The smaller the strides, the finer-grained the mapping of the addresses is for a given stride. For example, if searching in 4-bit strides, a very small block of memory is used to store the 2^4=16 memory locations for a 4-bit stride. If a memory block has no associated data (i.e., no forwarding address), then that memory block is bypassed and can be repurposed to conserve memory bandwidth. However, the tradeoff for saving memory using this procedure is the high latency incurred by the thirty-two sequential 4-bit strides needed to span a 128-bit address. Parsing into longer bit strides reduces the quantity of sequential strides, but does not allow for tailoring the memory to take advantage of missing or duplicative entries.
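The stride arithmetic above can be sketched as follows. The function name split_into_strides and its parameters are hypothetical, introduced only for this illustration; the sketch assumes the stride width divides the address width evenly.

```python
def split_into_strides(address, address_bits, stride_bits):
    """Split an integer address into equal-width strides, most significant first."""
    if stride_bits <= 0 or address_bits % stride_bits != 0:
        raise ValueError("stride width must evenly divide the address width")
    if not 0 <= address < (1 << address_bits):
        raise ValueError("address does not fit in the given width")
    mask = (1 << stride_bits) - 1          # e.g. 0b1111 for a 4-bit stride
    count = address_bits // stride_bits    # number of sequential lookups
    strides = []
    for i in range(count):
        shift = stride_bits * (count - 1 - i)
        strides.append((address >> shift) & mask)
    return strides


if __name__ == "__main__":
    # A 32-bit address in 8-bit strides gives the familiar dotted-quad octets.
    print(split_into_strides(0xC0A80101, 32, 8))
    # A 128-bit address needs 32 sequential lookups at 4 bits per stride,
    # but only 16 lookups at 8 bits per stride.
    print(len(split_into_strides(1, 128, 4)), len(split_into_strides(1, 128, 8)))
    # Each stride indexes a block of 2**stride_bits entries.
    print(2 ** 4, 2 ** 8)
```

The tradeoff is visible in the numbers: doubling the stride from 4 to 8 bits halves the number of sequential lookups for a 128-bit address, while the block indexed by each stride grows from 16 to 256 entries.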
Search instructions and algorithms can be used in a linear and unidirectional way, such as searching for matching strings of increasing length when searching for an LPM. Referring to FIG. 1A, a basic binary trie 10 (pronounced ‘try’) is illustrated for conducting a search with successively matching bits in a bit string, or vector. For example, bit string, or address, ‘0001’ for node C has no data, while bit string ‘1101’ for node D does have data. To determine this result for ‘0001’, for example, a search would start at the top of the diagram, move left three successive times for the first three bits ‘000’, and then move right to arrive at the location of ‘0001’, which in this case does not have associated data. Consequently, the LPM would be ‘00’, which has a darkened circle representing, for a network routing application, a forwarding address. Referring to FIG. 1B, a Patricia trie 11 is shown, which compacts a search by moving a data point upward if there is no decision to be made. For example, data at node A is moved up to node B, because C has no associated data, thereby making it irrelevant to distinguish the bits after node B.
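The binary trie search described above can be sketched as follows. This is a minimal illustration and not a reproduction of FIG. 1A: only the two prefixes that the text identifies as having data (‘00’ and ‘1101’) are inserted, and the class name, function names, and next-hop labels are hypothetical.

```python
class TrieNode:
    """One node of a binary trie; data is None when no forwarding entry exists."""

    def __init__(self):
        self.children = {}  # maps '0' (left) or '1' (right) to a TrieNode
        self.data = None


def trie_insert(root, prefix, data):
    """Store data at the node reached by following the bits of prefix."""
    node = root
    for bit in prefix:
        node = node.children.setdefault(bit, TrieNode())
    node.data = data


def trie_longest_prefix_match(root, bits):
    """Walk the trie bit by bit and return (prefix, data) for the deepest node
    along the path that holds data, or None if no such node exists."""
    best = ("", root.data) if root.data is not None else None
    node = root
    for depth, bit in enumerate(bits, start=1):
        node = node.children.get(bit)
        if node is None:
            break  # no deeper match is possible
        if node.data is not None:
            best = (bits[:depth], node.data)
    return best


if __name__ == "__main__":
    root = TrieNode()
    trie_insert(root, "00", "next-hop-1")    # hypothetical forwarding data
    trie_insert(root, "1101", "next-hop-2")  # hypothetical forwarding data
    print(trie_longest_prefix_match(root, "0001"))
    print(trie_longest_prefix_match(root, "1101"))
```

Searching for ‘0001’ in this sketch passes through the node for ‘00’, records it as the best match so far, and finds no data at any deeper node, so ‘00’ is returned as the LPM. A Patricia trie would additionally collapse chains of single-child nodes, which this sketch does not attempt.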
Search instructions and algorithms can also be used in a circular or multi-directional manner. For example, FIG. 1C illustrates a cycle graph 12 that can have a path that is linear and open, e.g., A-C-E; a closed path with repeated vertices, e.g., B-F-C-E-F-D-B; and a cycle with no repeated edges or vertices, e.g., B-F-D-B. The specific choice depends on the application and on which addresses are used as the next hop address. Regardless of the application, be it a linear Patricia trie or a cycle graph, both can benefit from improvements in data matching.
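The three kinds of paths named above can be distinguished with a short sketch. It classifies a sequence of vertices by inspecting the sequence alone; it does not check the sequence against the edges of cycle graph 12, which are not listed in the text. The function name classify_walk and the category labels are hypothetical.

```python
def classify_walk(walk):
    """Classify a vertex sequence such as ['B', 'F', 'D', 'B'].

    A walk is closed when it ends at the vertex where it started. The repeated
    start/end vertex of a closed walk is not counted as a repetition.
    """
    if len(walk) < 2:
        raise ValueError("a walk needs at least two vertices")
    closed = walk[0] == walk[-1]
    interior = walk[:-1] if closed else walk
    has_repeat = len(set(interior)) != len(interior)
    if not closed:
        return "open walk with repeated vertices" if has_repeat else "open path"
    if has_repeat:
        return "closed walk with repeated vertices"
    if len(interior) < 3:
        # e.g. A-B-A retraces a single edge, so it is not treated as a cycle
        return "degenerate closed walk"
    return "cycle"


if __name__ == "__main__":
    for text in ("A-C-E", "B-F-C-E-F-D-B", "B-F-D-B"):
        print(text, "->", classify_walk(text.split("-")))
```

Applied to the examples in the text, A-C-E is classified as an open path, B-F-C-E-F-D-B as a closed walk with repeated vertices (vertex F is visited twice), and B-F-D-B as a cycle.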
Prior attempted solutions that scale into multiple arrays of a fixed length are mathematically determinate. However, such solutions are not helpful when the depths of the nested arrays are variable.