The communications industry is rapidly changing to adjust to emerging technologies and ever-increasing customer demand. This customer demand for new applications and increased performance of existing applications is driving communications network and system providers to employ networks and systems having greater speed and capacity (e.g., greater bandwidth). In trying to achieve these goals, a common approach taken by many communications providers is to use packet switching technology. Increasingly, public and private communications networks are being built and expanded using various packet technologies, such as Internet Protocol (IP). Note, nothing described or referenced in this document is admitted as prior art to this application unless explicitly so stated.
A network device, such as a switch or router, typically receives, processes, and forwards or discards a packet based on one or more criteria, including the type of protocol used by the packet, addresses of the packet (e.g., source, destination, group), and type or quality of service requested. Additionally, one or more security operations are typically performed on each packet. But before these operations can be performed, a packet classification operation, typically including one or more lookup operations, must be performed on the packet.
The use of hash functions and hash tables is one approach for performing lookup operations on a set of data to identify a matching item, if one exists. Typically, a hash function is used where the range of possible values of the data is larger than the size of the memory or other storage mechanism to be used for storing the data, and the actual enumerated data items are sparse relative to the range of possible values.
Standard issues for implementations of lookup mechanisms using hash functions include the time for inserting an item and the maximum time required for looking up an item. Typically, it is important for a lookup mechanism to have a maximum bound on the time required to look up a data item. These problems typically drive the need for new and different hashing mechanisms and methods. For example, when two data items to be stored are hashed to the same position, a collision occurs, and some mechanism is used for storing the multiple values, such as pre-allocating space for multiple data items in the bucket (i.e., the position in the hash table to which the value hashes) for each hash value in the memory, or using a linked list of items for storing multiple items in a bucket. Implementations that use linked lists, which require a memory read operation for each of the linked elements, are especially time consuming, and insertion mechanisms that do not bound the number of items corresponding to a single hashed value may be unacceptable for certain real-time applications.
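As an illustration of the pre-allocated bucket option, the following is a minimal Python sketch in which every bucket has a fixed number of slots, so a lookup never examines more than that many entries. The class name `BoundedBucketTable`, its parameters, and the use of Python's built-in `hash` are assumptions made only for this example.

```python
class BoundedBucketTable:
    """Illustrative hash table whose buckets hold at most a fixed number of items."""

    def __init__(self, num_buckets=8, slots_per_bucket=2):
        self.num_buckets = num_buckets
        self.slots_per_bucket = slots_per_bucket
        # One list per bucket; each list never grows past slots_per_bucket.
        self.buckets = [[] for _ in range(num_buckets)]

    def _index(self, item):
        # The bucket is the position in the table to which the item hashes.
        return hash(item) % self.num_buckets

    def insert(self, item):
        """Return True if the item is stored, False if its bucket is full."""
        bucket = self.buckets[self._index(item)]
        if item in bucket:
            return True
        if len(bucket) >= self.slots_per_bucket:
            return False  # collision overflow: the caller must handle it
        bucket.append(item)
        return True

    def lookup(self, item):
        # Examines at most slots_per_bucket entries, so lookup time is bounded.
        return item in self.buckets[self._index(item)]
```

With four buckets and two slots per bucket, the integer keys 1, 5, and 9 all hash to the same bucket, so the third insertion is refused rather than lengthening the bucket.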
Many different techniques are known and used for hashing. A prior approach uses a single hash function that generates a fixed random position within the storage space based on the value to be stored and evenly stores the data across the storage space. Multiple items are stored within a bucket such as by a linked list and/or pre-allocated space in each bucket for storing multiple data items. Another prior approach uses a single hash table and function with space allocated in each bucket to store multiple data items.
Another prior approach is to use two independent hash functions with two different hash tables with each bucket having space to store a single item. When inserting a data item, the position to add the data item is determined for each of the different hash functions, and the data item is inserted into whichever of the two identified buckets contains the smaller number of data items. If both non-full buckets contain the same number of items (i.e., zero or one), then the data item is added to the bucket in the “left” hash table (i.e., always the same predetermined hash table). If both buckets are full prior to adding the item to the hash table, then the item is stored in a separate data structure and it will be located in a secondary search operation typically performed by a software process, rather than hardware optimized for maintaining and searching hash tables. When searching the two hash tables, two lookup operations are performed in parallel, one on each of the two hash tables.
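The insertion rule of this two-table approach can be sketched in Python as follows. This is only an illustration under stated assumptions: the class name `TwoChoiceTable` is hypothetical, keys are non-negative integers, the two hash functions are simple arithmetic stand-ins rather than truly independent functions, and the bucket capacity is a parameter that defaults to two so that the tie case can be exercised. The two lookups are performed one after the other here, whereas hardware would perform them in parallel.

```python
class TwoChoiceTable:
    """Illustrative two-table scheme with a "left" tie-break and an overflow store."""

    def __init__(self, num_buckets=8, capacity=2):
        self.num_buckets = num_buckets
        self.capacity = capacity  # assumed bucket capacity for this sketch
        self.left = [[] for _ in range(num_buckets)]
        self.right = [[] for _ in range(num_buckets)]
        self.overflow = set()  # separate structure for items that fit nowhere

    def _h_left(self, key):
        return key % self.num_buckets

    def _h_right(self, key):
        # Stand-in second hash function (assumption, not truly independent).
        return (key // self.num_buckets) % self.num_buckets

    def _where(self, key):
        if key in self.left[self._h_left(key)]:
            return "left"
        if key in self.right[self._h_right(key)]:
            return "right"
        if key in self.overflow:
            return "overflow"
        return None

    def lookup(self, key):
        return self._where(key) is not None

    def insert(self, key):
        """Store the key and return where it resides: left, right, or overflow."""
        found = self._where(key)
        if found is not None:
            return found
        lb = self.left[self._h_left(key)]
        rb = self.right[self._h_right(key)]
        if len(lb) >= self.capacity and len(rb) >= self.capacity:
            self.overflow.add(key)  # both candidate buckets are full
            return "overflow"
        if len(lb) <= len(rb):  # fewer items wins; a tie always goes left
            lb.append(key)
            return "left"
        rb.append(key)
        return "right"
```

With four buckets, the keys 0, 16, 32, 48, and 64 all map to bucket 0 in both tables under these stand-in functions, so successive insertions alternate left and right until both buckets are full and the last key falls through to the overflow store.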
Another prior approach, known as cuckoo hashing, uses two hash tables with each bucket having a single entry. Data values are inserted sequentially into the hash tables. For a given value to be added, it is added to the first hash table at the position according to the first hash function. If the position was empty prior to the addition, then processing is complete. Otherwise, the value previously stored at that location is stored in the second hash table at the position according to the second hash function. If the position was empty prior to the addition, then processing is complete. Otherwise, the value previously stored at that location is stored in the first hash table at the position according to the first hash function, and so on. To bound the insert time, if a maximum number of iterations is exceeded, the hash functions are considered to have failed and different hash functions are used for all the entries. Drawbacks of this approach include the large number of buckets required for each of the hash tables and the long worst-case insertion time of an entry.
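The displacement loop described above can be sketched in Python as follows. This is a minimal illustration rather than a definitive implementation: the class name `CuckooTable` is hypothetical, keys are assumed to be non-negative integers, `None` marks an empty bucket, and the two hash functions are simple arithmetic stand-ins. Re-hashing all entries with different hash functions after a failure is left to the caller.

```python
class CuckooTable:
    """Illustrative cuckoo hashing: two tables, one entry per bucket."""

    def __init__(self, num_buckets=8, max_iterations=16):
        self.num_buckets = num_buckets
        self.max_iterations = max_iterations
        self.tables = [[None] * num_buckets, [None] * num_buckets]

    def _position(self, which, key):
        # Stand-in hash functions for table 0 and table 1 (assumptions).
        if which == 0:
            return key % self.num_buckets
        return (key // self.num_buckets) % self.num_buckets

    def lookup(self, key):
        # At most two probes: one bucket in each table.
        return any(
            self.tables[t][self._position(t, key)] == key for t in (0, 1)
        )

    def insert(self, key):
        """Return None on success, or the key left without a bucket on failure."""
        if self.lookup(key):
            return None
        which = 0  # always start with the first table
        for _ in range(self.max_iterations):
            pos = self._position(which, key)
            evicted = self.tables[which][pos]
            self.tables[which][pos] = key
            if evicted is None:
                return None  # the position was empty, so processing is complete
            key = evicted  # the displaced value moves to the other table
            which = 1 - which
        return key  # iteration bound exceeded: the hash functions have failed
```

With four buckets, the keys 0, 16, and 32 compete for the same bucket in each table under these stand-in functions, so inserting the third of them cycles until the iteration bound is reached and one key is returned without a home.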