Searching a knowledge base to determine a match or conflict for a given search object is an important requirement for many different types of applications. For example, the following applications rely heavily on fast searches: database retrieval; expert systems; robotic and state control strategy; signal recognition, including for example speech and image recognition; communications, including for example data compression and protocol processing for bridging, routing and switching applications; natural language cognitive systems; modeling operations; parsers; and compilers. Thus, a great deal of work has been done in an attempt to find better ways of performing searches.
The effectiveness of a searching scheme can be evaluated based upon three separate, but related, characteristics. First, the cost of the scheme in terms of memory required to implement the knowledge base should be minimized. Second, the speed of the searching scheme, as determined by the worst case time required to complete a search, should be maximized. Generally, searching schemes are implemented in a plurality of steps or cycles that each take a predetermined amount of time to complete. Thus, the maximum time to complete a search is generally reduced by minimizing the worst case number of required steps or cycles. Finally, the cost associated with updating the knowledge base to add or delete entries should be minimized. Typically, when a knowledge base is being updated, it cannot simultaneously be used to perform searching. Since the efficiency of any searching scheme is based on the number of searches it can perform in a given amount of time, time spent maintaining the knowledge base should be minimized.
For many applications, searches are performed on search objects having a large number of bits. An example of such an application is a bridge in a communication system. A typical communication system includes a number of devices that communicate over a plurality of connections. Typically, the system is organized into a plurality of local connections with only a limited number of devices being associated with each local connection. A network of bridges is used to interconnect the local connections so that each device can communicate with other devices that are not associated with the same local connection. Each local connection has an associated bridge that monitors traffic on the local connection, as well as external traffic (i.e., traffic on any other connection in the system). When a bridge determines that external traffic is destined for a device on its local connection, the bridge allows the information to pass through to the local connection. Similarly, when information is sent from the local connection to an external destination, the bridge allows the information to pass from the local connection. As a result, the local connection is not congested with external traffic that is not addressed to any of its devices.
Typically, devices communicate by moving packets of information from a source to a destination across the system. A packet typically includes bits identifying the addresses of the packet's source and destination devices, each of which can be on the order of forty-eight bits. Each bridge monitors traffic locally and externally by maintaining a knowledge base with an entry for each of the devices on its associated local connection. Thus, one function performed by a bridge is to receive externally sourced packets, and perform a search of its knowledge base to determine whether the 48-bit destination address matches any of the devices located on its local connection. The search object (i.e., the destination address) could therefore have a value equaling any of 2^48, or approximately 280 trillion, possible addresses. However, the number of entries in the bridge's knowledge base will be equal only to the number of devices on its associated local connection, and therefore, will be significantly less than 280 trillion.
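The bridge's filtering decision described above can be sketched as follows. This is a minimal illustration only: the function names `parse_address` and `passes_to_local`, the colon-separated address notation, and the use of a Python set as a stand-in for the knowledge base are all assumptions made for the example, not part of the scheme described here.

```python
ADDRESS_BITS = 48  # address width taken from the bridging example above


def parse_address(text):
    """Convert a colon-separated hex string (assumed notation) to a 48-bit integer."""
    value = int(text.replace(":", ""), 16)
    if not 0 <= value < (1 << ADDRESS_BITS):
        raise ValueError("address does not fit in 48 bits")
    return value


def passes_to_local(knowledge_base, destination):
    """Return True when the destination matches a device on the local connection.

    knowledge_base is any container holding one entry per local device; a set
    is used here only as a placeholder for whatever search scheme is chosen.
    """
    return destination in knowledge_base


# One entry per local device: far fewer entries than the 2**48 possible addresses.
local_devices = {parse_address("00:1a:2b:3c:4d:5e"), parse_address("00:1a:2b:3c:4d:5f")}
```

The point of the sketch is the mismatch in scale: the search object ranges over `1 << ADDRESS_BITS` values, while `local_devices` holds only a handful of entries.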
As seen from the foregoing, to implement the requirements of the bridging application, a searching scheme needs to operate on a search object that can have any of a huge number of possible values, but must search a number of knowledge base entries that is significantly less. This characteristic is typical of many searching applications. However, prior art searching techniques are not particularly well adapted to implement such searches.
An example of a technique for implementing a searching scheme is to use direct memory access. Under such a scheme, a memory is provided that has the capability of storing a number of entries equaling the full set of possible values that the search object could have. Thus, for the example described above, the 48-bit destination address would be used to directly address a knowledge base memory having a 48-bit address. Such a searching scheme would provide extremely fast searching and updating wherein each could be performed in one step or cycle because the 48-bit address could directly access each entry in the memory. However, for applications wherein the search object includes a large number of bits, the cost in memory to support the full set of possible search values may be so large as to make such a searching scheme impractical. For example, to use direct memory access to implement the bridging application described above, the knowledge base memory would need to include 2^48 (i.e., approximately 280 trillion) entries. Thus, because of the large amount of memory required, direct memory access is not practical for use in searching applications wherein the search object includes a large number of bits.
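As a rough illustration of this trade-off, the sketch below builds a directly addressed table for a deliberately small key width and computes how many entries a given width would require. The names `make_direct_table`, `direct_insert`, `direct_lookup` and `direct_table_entries`, and the 8-bit demonstration width, are assumptions chosen for the example.

```python
DEMO_KEY_BITS = 8  # small width (assumption) so the demonstration table fits in memory


def direct_table_entries(key_bits):
    """Number of entries a directly addressed table needs for key_bits-wide keys."""
    return 1 << key_bits


def make_direct_table(key_bits):
    """Allocate one presence flag for every possible key value."""
    return [False] * direct_table_entries(key_bits)


def direct_insert(table, key):
    """Update in a single step: the key itself is the memory address."""
    table[key] = True


def direct_lookup(table, key):
    """Search in a single step: the key itself is the memory address."""
    return table[key]


table = make_direct_table(DEMO_KEY_BITS)
direct_insert(table, 0x5E)
```

With 8-bit keys the table has 256 entries; calling `direct_table_entries(48)` gives 281,474,976,710,656 entries, which is why the approach is not attempted at that width in the sketch.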
Other prior art searching techniques have been developed wherein the size of the knowledge base is bounded by the maximum number of entries stored at one time. Informed searches are an example of such a searching technique. Informed searches are implemented using a plurality of look-ups of the knowledge base memory, with the result of each look-up being used to determine the memory address for the next look-up. A binary search is an example of an informed search. In a binary search system, all of the search entries stored in the knowledge base are kept in sorted order. To perform a search for a given search object, the object is first compared with the entry stored in the middle of the knowledge base. If the search object does not match the middle entry, an indication is provided as to whether the value of the search object is higher or lower, such that the determination of which entry to compare next is an informed one. The search object is then compared with the entry stored midway between the middle entry and either the upper or lower entry. In this manner, the searching scheme continues to divide the remaining entries in half until a match or miss is determined.
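The binary search procedure described above can be sketched as follows. This is a conventional textbook formulation rather than any particular prior art implementation; the function name `binary_search` and the returned probe count are additions made for illustration.

```python
def binary_search(sorted_entries, target):
    """Search a sorted list; return (index or None, number of look-ups performed)."""
    low, high = 0, len(sorted_entries) - 1
    probes = 0
    while low <= high:
        mid = (low + high) // 2  # entry in the middle of the remaining range
        probes += 1
        if sorted_entries[mid] == target:
            return mid, probes  # match
        if sorted_entries[mid] < target:
            low = mid + 1  # search object is higher: keep the upper half
        else:
            high = mid - 1  # search object is lower: keep the lower half
    return None, probes  # miss


entries = [3, 8, 15, 23, 42, 57, 91]
```

Each look-up result determines which half of the remaining entries is examined next, which is what makes the search informed.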
Binary searching provides an advantage over the direct memory access scheme because the knowledge base need only include sufficient memory to store the worst case number of entries actually being searched at one time, rather than having sufficient memory to store an entry corresponding to every value that the search object could possibly have. However, a performance price is paid both in search speed and in the cost of updating the knowledge base to add or delete entries. In a binary search system, the relative position of the entries in the knowledge base is important because that is what enables the search to be informed. Thus, when the knowledge base is updated with an entry having a relatively low value, it is possible that each of the entries in the knowledge base would need to be moved to readjust their positions. The cost of updating the knowledge base is therefore a function of, and in the worst case is equal to, the number of entries stored. Additionally, the worst case time required to perform a binary search is also greater than that for the direct memory access scheme. Rather than performing every search in one step or cycle, the worst case number of cycles required to perform a binary search is a function of the number N of entries in the knowledge base, and is approximately log_2 N.
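Both costs can be made concrete with a short sketch. The helper names `insert_sorted` and `worst_case_probes` are hypothetical, and a Python list stands in for the sorted knowledge base memory. For the textbook binary search loop, the exact worst case number of look-ups is floor(log_2 N) + 1, which grows as log_2 N.

```python
import bisect


def insert_sorted(entries, value):
    """Insert value into a sorted list; return how many existing entries had to move."""
    position = bisect.bisect_left(entries, value)
    moved = len(entries) - position  # every entry at or after position shifts by one
    entries.insert(position, value)
    return moved


def worst_case_probes(n):
    """Worst case look-ups for a textbook binary search over n entries.

    For n >= 1 this equals floor(log2(n)) + 1, computed here with bit_length.
    """
    return n.bit_length()


knowledge_base = [10, 20, 30, 40]
moved = insert_sorted(knowledge_base, 5)  # a low value displaces every stored entry
```

Inserting the low value 5 moves all four stored entries, matching the worst case update cost described above, while a knowledge base of 1024 entries needs at most 11 look-ups per search.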
In an attempt to increase searching speed, some searching schemes have incorporated hashing techniques wherein the knowledge base entries are stored in hashing tables. Hashing tables use a non one-to-one mapping function in an attempt to improve the probability of finding an early match. However, because non one-to-one mapping is used, more than one search entry can map to a single address, creating a hashing collision. Thus, hashing schemes must include some mechanism for resolving collisions. Since the resolution of hashing collisions generally requires additional look-ups, collisions can greatly reduce the speed at which searches of a hashing table can be performed.
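The effect of collisions on look-up count can be seen in a small sketch. Separate chaining is used here as one common collision-resolution mechanism; it is an assumption for the example, as are the class name `ChainedHashTable`, the modulo mapping function, and the bucket count.

```python
class ChainedHashTable:
    """Hash table that resolves collisions by chaining entries within a bucket."""

    def __init__(self, num_buckets):
        self.buckets = [[] for _ in range(num_buckets)]

    def _index(self, key):
        # Many-to-one mapping: distinct keys can share a bucket (a collision).
        return key % len(self.buckets)

    def insert(self, key):
        bucket = self.buckets[self._index(key)]
        if key not in bucket:
            bucket.append(key)

    def lookup(self, key):
        """Return (found, number of stored entries compared)."""
        bucket = self.buckets[self._index(key)]
        for compared, stored in enumerate(bucket, start=1):
            if stored == key:
                return True, compared
        return False, len(bucket)


hash_table = ChainedHashTable(num_buckets=8)
for key in (3, 11, 19):  # all three keys map to bucket 3
    hash_table.insert(key)
```

The first key stored in a bucket is found with one comparison, but each colliding key adds a further look-up, so a search for 19 here takes three comparisons.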
In view of the foregoing, it is an object of the present invention to provide an improved method and apparatus for performing searches.