Various packet switching architectures are known for interconnecting and exchanging data among computers. Of these, the Internet is the most popular. The Internet, short for Internetwork, is global in scope and connects networks and computers across businesses, organizations and residences around the world. Intranets have similar architectures but their scope is restricted to single businesses and organizations. Typically, messages sent from one host computer to another travel through a number of routers that form the connection between source and destination computers. Each computer of the Internet, or an Intranet, is assigned an identifying string of bits called an address. The address of an intended destination host computer is added to a packet by a source computer as the packet is being sent. The process of packet delivery is referred to as routing and the equipment that performs this task is a router. A router sorts out incoming packets and sends them toward the respective destinations. Each router must contain a current topological layout of the Internet which indicates which paths to take to reach any host computer. Routers periodically exchange topological information using signaling protocols that are known as routing protocols. In this way, each router is updated with respect to Internet additions and deletions.
In order for routers to relay messages toward their proper destinations, each host machine must have a unique identification address. Due to the high number of interconnected computers, over forty million as of 1995, it was deemed prudent in the early years of the Internet to introduce a hierarchy of address strings. This is the essence of the Internet Protocol. Under the Internet Protocol, the most significant bits of an address string designate a network and the remaining portion is reserved for identifying host computers that are connected to the designated network. It was also realized that a further hierarchy might be desired by large organizations, so the host computer part of the string is further subdivided into a subnet address and host identification number. For a given network address, the corresponding organization chooses the number of subnets and the size of the subnetwork field, giving rise to many different ways of partitioning addresses. A thirty-two bit address, therefore, might designate a network, a subnetwork, and a host computer. Note that as the bits are processed from most significant to least significant, the granularity of the addressed entity becomes finer. In other words, a set of most significant bits addresses a network, a set of next most significant bits designates a subnetwork, and the remaining set of bits denotes a machine. The network field could range from eight to twenty-four bits, depending on the class of network. A class A network is a large network and has a field of eight bits. A class B network is smaller and has a field of sixteen bits, while a class C network is smaller yet and has a field of twenty-four bits. Under the Internet Protocol, the length of a network field is determined from the most significant bits of the address. 
If the first bit of the address is a "0", a class A network is indicated; if the first two bits are "10", a class B network is indicated; and if the first three bits are "110", a class C network is indicated. The subnetwork and host computer fields are contained in the remainder of the thirty-two bits. As indicated above, the size of the subnet field is chosen by the organization rather than fixed by universally accepted boundaries. The conventional method for determining the end of the subnetwork field and the beginning of the host field is to use a bit mask and a masking function.
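The class test and the masking function described above can be sketched as follows. This is a minimal illustration only, assuming 32-bit addresses held as Python integers and a contiguous subnet mask whose 1-bits cover the network and subnet fields together; the function names `classify` and `split_address` are hypothetical and do not come from the text.

```python
def classify(address):
    """Return (class letter, network-field length in bits) for a 32-bit address."""
    if address >> 31 == 0b0:
        return "A", 8
    if address >> 30 == 0b10:
        return "B", 16
    if address >> 29 == 0b110:
        return "C", 24
    raise ValueError("not a class A, B, or C address")


def split_address(address, subnet_mask):
    """Split an address into (network, subnet, host) fields.

    Assumes subnet_mask is contiguous: 1-bits over the network and
    subnet fields, 0-bits over the host field.
    """
    _, net_bits = classify(address)
    network = address >> (32 - net_bits)

    # The host field is whatever the mask leaves uncovered.
    host_mask = ~subnet_mask & 0xFFFFFFFF
    host = address & host_mask
    host_bits = host_mask.bit_length()

    # The subnet field is the masked part below the network field.
    subnet_only_mask = subnet_mask & ((1 << (32 - net_bits)) - 1)
    subnet = (address & subnet_only_mask) >> host_bits
    return network, subnet, host


# 128.10.2.30 with mask 255.255.255.0: class B, subnet 2, host 30.
fields = split_address(0x800A021E, 0xFFFFFF00)
```

Under these assumptions the same address yields different subnet and host fields when a different mask is supplied, which is why the mask has to accompany the address.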
The routing of a received packet is based on the accompanying address string. The address string is used as a search key in a database which contains the address string along with other pertinent details such as which router is next in the delivery of the packet. The database is referred to as a routing table, while the link between the current router and the next router is called the next hop in the progress of the packet. The routing table search process depends on the structure of the address as well as how the table is organized. As an example, a search key of a size less than eight bits and having a nonhierarchical structure would most efficiently be found in a routing table organized as a series of address entries. The search key would be used as an index in the table to locate the right entry. For a search key of a larger size, say thirty-two bits, the corresponding routing table may have more than ten thousand entries. Organizing the database as a simple table to be searched directly by an index would waste a large amount of memory space, because most of the table would be empty.
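The trade-off between direct indexing and table size can be made concrete with a short sketch. The six-bit key width, the interface name, and the helper names `lookup_direct` and `occupancy` are illustrative assumptions; only the figure of roughly ten thousand entries for a thirty-two bit key is taken from the text.

```python
# Direct indexing for a small, nonhierarchical key: one slot per key value.
SMALL_KEY_BITS = 6  # assumed width, below the eight bits mentioned above
small_table = [None] * (1 << SMALL_KEY_BITS)
small_table[0b101101] = "interface-3"  # hypothetical next-hop entry


def lookup_direct(key):
    """Use the search key itself as the table index."""
    return small_table[key]


def occupancy(entries, key_bits):
    """Fraction of a directly indexed table that would actually be used."""
    return entries / (1 << key_bits)


# Ten thousand routes in a table indexed by a 32-bit key would fill only
# a few millionths of its slots.
fraction_used = occupancy(10_000, 32)
```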
Conventional routers break up the search process into three steps. The first step is to determine whether the router is directly connected to the destination host computer. If this is the case, the message is one hop from the destination and should be routed in that direction. If the destination computer is not directly connected to the router, the second step is to determine in which topological direction the destination network lies. If the direction is determined from the topological layout, the message should be routed that way. If it is not determined, the third step is to route the message along a default link.
Typically, the first step is performed using a linear search of a table containing the thirty-two bit addresses of host computers directly connected to the router. Reflecting the local topology, each entry in the address table is connected to a corresponding output interface leading directly to the addressed computer. When a destination address is received by a router, the full thirty-two bits are compared with each of the destination addresses in the table. If a match is made, the message is sent directly to the corresponding destination via the specified router interface. If no match is made, the second step of the routing procedure is taken.
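The three-step procedure, with a linear search for the first step, might be sketched as below. This is an illustration under stated assumptions rather than any particular router's implementation: the names `route` and `network_of` are hypothetical, interfaces are represented as strings, and a plain dictionary stands in for whatever structure performs the second step.

```python
def network_of(address):
    """Extract the classful network field of a 32-bit address.

    Assumption: only class A, B, and C addresses are presented; any
    other address is treated like class C for simplicity.
    """
    if address >> 31 == 0:
        return address >> 24
    if address >> 30 == 0b10:
        return address >> 16
    return address >> 8


def route(dest, direct_hosts, network_routes, default_interface):
    # Step 1: linear search over the directly connected hosts,
    # comparing all thirty-two bits of each entry.
    for host_address, interface in direct_hosts:
        if host_address == dest:
            return interface

    # Step 2: find the topological direction of the destination network.
    interface = network_routes.get(network_of(dest))
    if interface is not None:
        return interface

    # Step 3: nothing matched, so fall back to the default link.
    return default_interface


direct_hosts = [(0x800A021E, "eth0")]  # hypothetical local host
network_routes = {0x0A: "eth1"}        # hypothetical route to network 10
```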
The second step, determining the direction of the destination network, is not usually performed by a linear search method through a table since the number of network addresses would make such a table unwieldy. In the days when address strings conformed to the three-level hierarchy of network address, subnet address and host identification, routers performed the determination using one of several known techniques, such as hashing, Patricia-tree searching, and multilevel searching. In hashing, a hash function reduces the network portion of the address, producing a small, manageable index. The hash index is used to index a hash table and search for a matching hash entry. Corresponding to each hash entry of the hash table is the address of an output interface pointing in the topological direction of a corresponding network. If a match is found between the hashed network portion and a hash entry, the message is directed towards the corresponding interface and destination network.
Hashing reduces a large unwieldy field to a small manageable index. In the process, however, there is a chance that two or more fields may generate the same hash index. This occurrence is referred to as a collision, since these fields must be stored in the same location in the hash table. Further searching is needed to differentiate the entries thus in collision. Collisions, therefore, reduce the efficiency obtained by using the hashing search and in the worst case, where all permissible addresses coalesce to a single index, render hashing practically useless as a search process.
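The effect of collisions on search effort can be seen in a small sketch. The modulo hash function, the bucket count of eight, and the class name `HashRouteTable` are assumptions chosen for illustration; colliding entries are kept in a per-bucket list, which is one common way to resolve collisions and is not necessarily the one a given router uses.

```python
NUM_BUCKETS = 8  # assumed table size


def hash_index(network):
    """Reduce a network field to a small index (illustrative hash only)."""
    return network % NUM_BUCKETS


class HashRouteTable:
    def __init__(self):
        self.buckets = [[] for _ in range(NUM_BUCKETS)]

    def insert(self, network, interface):
        bucket = self.buckets[hash_index(network)]
        for i, (stored, _) in enumerate(bucket):
            if stored == network:
                bucket[i] = (network, interface)  # overwrite existing entry
                return
        bucket.append((network, interface))

    def lookup(self, network):
        """Return (interface or None, number of entries compared)."""
        comparisons = 0
        for stored, interface in self.buckets[hash_index(network)]:
            comparisons += 1
            if stored == network:
                return interface, comparisons
        return None, comparisons


table = HashRouteTable()
table.insert(3, "eth0")
table.insert(11, "eth1")  # 11 % 8 == 3, so this collides with network 3
```

Looking up network 11 here needs two comparisons instead of one. If every stored network hashed to the same index, each lookup would degrade to a linear scan of the whole table, which is the worst case described above.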
Patricia-tree searching avoids the collisions encountered by hashing methods. The exact search algorithm is complex and need not be fully described here. In short, this method of searching requires that all address strings and accompanying information, such as related route information, be stored in a binary tree. Starting from the most significant bit position within the address string, the search process compares the address, bit by bit, with the tree nodes. A matched bit value guides the search to visit either the left or the right child node and the process is repeated for the next bit of the address. The search time is proportional to the size of the longest address string stored. In Patricia-tree searching, the difference between the average search time and the worst case search time is not very large. Moreover, the routing table is organized quite efficiently. Typically, it requires less memory than comparable routing tables of hashing methods. Patricia-tree searching handles the worst case searches better than the hashing methods, but in most cases it takes significantly longer to locate a match. Thus, many conventional routers use a combination of hashing and Patricia-tree searching. This combination is called multilevel searching.
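The bit-by-bit descent can be illustrated with a plain binary trie. Note that this is a deliberate simplification and not a Patricia tree: a Patricia tree additionally collapses chains of single-child nodes, which is omitted here. The class names `Node` and `BinaryTrie` and the fixed key width are assumptions.

```python
class Node:
    __slots__ = ("children", "route")

    def __init__(self):
        self.children = [None, None]  # index 0: left child, 1: right child
        self.route = None


class BinaryTrie:
    """Uncompressed binary trie over fixed-width keys (not a Patricia tree)."""

    def __init__(self, key_bits=32):
        self.key_bits = key_bits
        self.root = Node()

    def insert(self, key, route):
        node = self.root
        # Walk from the most significant bit to the least significant bit.
        for pos in range(self.key_bits - 1, -1, -1):
            bit = (key >> pos) & 1
            if node.children[bit] is None:
                node.children[bit] = Node()
            node = node.children[bit]
        node.route = route

    def search(self, key):
        node = self.root
        for pos in range(self.key_bits - 1, -1, -1):
            node = node.children[(key >> pos) & 1]
            if node is None:
                return None  # no stored key shares this bit sequence
        return node.route


trie = BinaryTrie(key_bits=8)  # short keys keep the example small
trie.insert(0b10110000, "eth0")
```

In this sketch every search visits at most `key_bits` nodes regardless of how many keys are stored, which matches the observation that search time tracks the length of the longest stored address string.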
Multilevel searching joins hashing with Patricia-tree searching. A cache stores a hash table containing a subset of most recently, and presumably most commonly, routed network addresses, while a Patricia tree stores the full set of network addresses. As a message is received, the destination address is hashed onto the table. If it is not located within a predetermined period of time, the address is passed to the Patricia-tree search engine which ensures that the address, if stored, will be found.
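A rough sketch of the two-level arrangement follows. Several simplifications are assumed: the cache is an ordered dictionary that evicts its least recently used entry, a plain dictionary stands in for the Patricia tree holding the full set of networks, and the "predetermined period of time" is modeled simply as a cache miss. The class name `MultilevelTable` is hypothetical.

```python
from collections import OrderedDict


class MultilevelTable:
    def __init__(self, full_table, cache_size=2):
        self.full_table = full_table  # stand-in for the Patricia tree
        self.cache = OrderedDict()    # recently routed networks
        self.cache_size = cache_size

    def lookup(self, network):
        """Return (interface or None, which level answered)."""
        if network in self.cache:
            self.cache.move_to_end(network)  # mark as most recently used
            return self.cache[network], "cache"

        # Cache miss: fall back to the complete table.
        interface = self.full_table.get(network)
        if interface is not None:
            self.cache[network] = interface
            if len(self.cache) > self.cache_size:
                self.cache.popitem(last=False)  # evict least recently used
        return interface, "full"


table = MultilevelTable({1: "eth0", 2: "eth1", 3: "eth2"}, cache_size=2)
```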
The three-level hierarchy of the Internet Protocol was replaced by Classless Inter-Domain Routing (CIDR). The development of CIDR was necessitated by the explosive growth in the size of the Internet and the near exhaustion of the Class B network addresses. In summary, CIDR specifies how two or more organizations that share a common path to the next hop may share a single routing table entry and thus reduce the total size of a routing table. With CIDR, the search rule is easy to describe, but difficult to implement. Matching a search key to an address string stored in the routing table requires that the greatest possible number of consecutive bits, starting with the most significant bit, be matched. What makes this matching rule hard to implement is that a routing table may contain multiple partial matches for the same key. When such is the case, a router is required to use the entry with the longest prefix match in order to route the packet correctly. Allowing partial matches, together with finer granularity in defining field boundaries within the address, makes conventional hashing methods inappropriate to use. The Patricia-tree search also becomes more complex and requires backtracking through tree nodes already visited. What is needed is a routing method that quickly and efficiently searches a routing table for addresses using a longest prefix matching rule.
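The longest prefix matching rule itself can be stated in a few lines. The sketch below only illustrates the rule with a naive linear scan over every entry; it is not offered as the fast method called for above. The function name `longest_prefix_match`, the tuple layout of the table, and the sample routes are assumptions, and each stored prefix is assumed to have its bits beyond the prefix length set to zero.

```python
def longest_prefix_match(dest, table):
    """Return the next hop of the matching entry with the longest prefix.

    table holds (prefix, prefix_length, next_hop) tuples for 32-bit
    addresses; returns None if no entry matches.
    """
    best_hop = None
    best_length = -1
    for prefix, length, next_hop in table:
        # Mask with `length` leading 1-bits; a length of 0 gives a mask
        # of 0, so such an entry matches every address.
        mask = (0xFFFFFFFF << (32 - length)) & 0xFFFFFFFF
        if dest & mask == prefix and length > best_length:
            best_hop, best_length = next_hop, length
    return best_hop


routes = [
    (0x0A000000, 8, "hop-A"),     # 10.0.0.0/8
    (0x0A010000, 16, "hop-B"),    # 10.1.0.0/16, more specific
    (0x00000000, 0, "hop-default"),
]
```

An address such as 10.1.2.3 matches all three entries in this table, and the rule selects the sixteen-bit prefix. The scan compares against every entry, so its cost grows with the size of the table.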