1. Technical Field
The present invention relates in general to pattern match searching for text strings, and in particular to a method and system within a data processing network for parsing text strings such that the efficiency of a pattern match search may be improved. More particularly, the present invention relates to efficiently performing incremental full match searches within a lookup table that cumulatively produce a longest prefix match result.
2. Description of the Related Art
Parsing of text strings is a common processing task requiring significant processor cycles. Within a network environment, an example of such parsing tasks is processing of Uniform Resource Identifier (URI) strings. A URI is a compact string of characters for identifying an abstract or physical resource. A URI can be further classified as a locator, a name, or both. A Uniform Resource Locator (URL) is a type of URI string that identifies resources via a representation of their primary access mechanism (e.g., their network “location”). URL addresses serve as the global addresses utilized by Web browsers to access documents and other resources on the Internet. As utilized herein, “the Internet” refers to the worldwide collection of networks that utilize the Transmission Control Protocol/Internet Protocol (TCP/IP) suite of protocols to communicate with one another. A URL specifies the protocol to be utilized in accessing a resource (such as http: for a World Wide Web page), the name of the server on which the resource resides (such as //www.ibm.com), and, optionally, the path to a particular resource (such as a hypertext markup language file) on the server. Encoded within each URL address string is the Internet Protocol (IP) address of the destination server.
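By way of illustration only, the three URL components named above (protocol, server name, and optional path) may be separated as in the following minimal sketch. It relies on the standard Python `urllib.parse.urlsplit` function; the helper name `split_url` and the path `/products/index.html` are assumptions introduced for the example and do not appear in the description above.

```python
from urllib.parse import urlsplit


def split_url(url):
    """Split a URL into (protocol, server name, path).

    Hypothetical helper for illustration; it only wraps the
    standard-library parser.
    """
    parts = urlsplit(url)
    # scheme   -> protocol used to access the resource, e.g. "http"
    # hostname -> name of the server on which the resource resides
    # path     -> optional path to a particular resource on that server
    return parts.scheme, parts.hostname, parts.path


# The path below is an assumed example, not taken from the text.
protocol, server, path = split_url("http://www.ibm.com/products/index.html")
print(protocol, server, path)
```

Note that the sketch recovers the server name only; mapping that name to an IP address is a separate lookup step that is not shown.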
Parsing of URI character strings, such as URL addresses, is often incorporated within pattern searching algorithms utilized by network processors. Such pattern search algorithms are utilized to find the longest matching binary sequence from a collection of stored binary strings. Specifically, such tasks require comparing an input search key to a data string that is stored in a database to find the longest match. The database that stores the data strings often includes a lookup table that, after establishing a match between an input search key and a data string within the database, either retrieves information or executes a program linked to the data string.
Pattern matching searches are utilized in packet-based communication networks to facilitate routing of packets among multiple interconnected nodes. Specialized nodes called routers are responsible for delivering or “forwarding” a packet to its destination in accordance with an IP destination address. IP currently supports a network routing protocol called IPv4 (Internet Protocol Version 4) that utilizes a 32-bit address in the header of each packet. For each packet received through an input link interface, a router reads the address field to determine the identity of the device (such as another router or host) to which the packet should be forwarded before reaching its final destination. Depending on the size of the network and its structure, the packet is either directly forwarded to its destination or sent to another router, in much the same way that a letter is passed through several post offices until reaching its final address.
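The forwarding decision described above may be pictured as a longest prefix match over 32-bit IPv4 addresses. The following is a minimal sketch only, using the standard Python `ipaddress` module; the table contents, the next-hop names, and the function name `next_hop` are assumptions chosen for illustration (the address blocks are reserved documentation ranges), and the linear scan stands in for whatever lookup structure an actual router employs.

```python
import ipaddress

# Hypothetical forwarding table: address prefix -> next device.
ROUTES = {
    ipaddress.ip_network("0.0.0.0/0"): "default-gateway",
    ipaddress.ip_network("192.0.2.0/24"): "router-a",
    ipaddress.ip_network("192.0.2.128/25"): "router-b",
}


def next_hop(destination):
    """Return the next hop for the longest stored prefix covering destination."""
    addr = ipaddress.ip_address(destination)
    # Collect every stored prefix that contains the destination address.
    matches = [net for net in ROUTES if addr in net]
    # The most specific entry is the one with the greatest prefix length.
    best = max(matches, key=lambda net: net.prefixlen, default=None)
    return ROUTES[best] if best is not None else None


print(next_hop("192.0.2.200"))
print(next_hop("192.0.2.5"))
```

In this assumed table the address 192.0.2.200 is covered by all three entries, and the /25 entry is selected because it is the longest matching prefix.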
For Internet applications, a network processor determines the IP address of a destination server to which the packet is to be ultimately delivered by decoding a URL address. Network processors handle millions of packets per second, and thus must be capable of processing the URL strings very efficiently. Conventionally, URL strings are processed incrementally one byte at a time using a longest prefix match algorithm. The process continues to iterate as long as more than one stored URL prefix matches the corresponding piece of the URL string from the packet being processed. Once the process has eliminated all but one of the stored URL prefixes, the single remaining prefix is utilized to identify the desired destination address. After determining the optimum destination node, the router encodes the corresponding destination address into the address field of the packet and delivers the packet to a particular output link interface according to the encoded destination address. This method of URL processing lookup has become an increasingly critical delay bottleneck for Internet traffic.
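The conventional byte-at-a-time processing described above may be sketched as follows. This is one straightforward reading of that description and not a definitive implementation: the function name `longest_prefix_match`, the table contents, and the destination addresses are assumptions, and each character of a Python string stands in for one byte. Unlike the description, the sketch keeps comparing after a single candidate remains, so that the remaining prefix is confirmed to match in full before it is used.

```python
def longest_prefix_match(url, table):
    """Return (prefix, destination) for the longest stored prefix of url.

    table maps stored URL prefixes to destination addresses.
    Returns None when no stored prefix matches.
    """
    candidates = set(table)  # prefixes consistent with the bytes seen so far
    best = None              # longest prefix that has matched in full
    for i, ch in enumerate(url):
        # Eliminate every prefix that is exhausted or disagrees at position i.
        candidates = {p for p in candidates if len(p) > i and p[i] == ch}
        # Record any prefix whose final byte has just been matched.
        for p in candidates:
            if len(p) == i + 1 and (best is None or len(p) > len(best)):
                best = p
        if not candidates:
            break  # nothing left to compare against
    return (best, table[best]) if best is not None else None


# Hypothetical stored prefixes and destination addresses.
TABLE = {
    "www.ibm.com/": "10.0.0.1",
    "www.ibm.com/products/": "10.0.0.2",
    "www.ibm.com/support/": "10.0.0.3",
}

print(longest_prefix_match("www.ibm.com/products/index.html", TABLE))
```

Each input byte is compared against every surviving candidate, so the cost grows with both the length of the URL string and the number of stored prefixes, which is consistent with the delay bottleneck noted above.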
It can therefore be appreciated that a need exists for an improved technique for parsing and processing a URI character string to efficiently determine a unique network resource. The present invention addresses such a need.