1. Technical Field
Embodiments of the invention relate generally to information processing and more particularly to classifying requests.
2. Prior Art
Information processing in network applications involves receiving requests and classifying them into different categories. The classification details are then used to take various actions, including policy enforcement, access control, and statistics collection and aggregation. A request from a user may include a Uniform Resource Identifier (URI) expression. A URI expression is a string of characters with a uniform syntax that is used to identify a resource. This identification enables interaction with representations of the resource over the network using specific protocols. URI expressions are defined in schemes, each of which defines a specific syntax and associated protocols. A URI expression may be classified as a Uniform Resource Locator (URL), a Uniform Resource Name (URN), or both. The complete address of a resource or file, including the protocol, the domain, and the name of the file, constitutes the URL. Persistent, location-independent resource identifiers constitute URNs.
URI expressions are de-referenced by clients or users as requests to retrieve a representation of the resource identified by the URI expression. A request from a user includes the URI expression and a collection of transport parameters. A repository of such URI expressions acts as a server, which examines the URI expression and the other transport parameters of the request and provides the requested resource as a response. Often a server is designed to serve multiple resources, and the server must identify the correct representation of the resource based on the URI expression and the other transport parameters of the request. URI classification helps the server to identify the correct representation of the resource.
Further, URI classification is necessary to expedite information retrieval by referencing the URI expressions of resources. URI classification helps to support a given public namespace under URI allocation and also facilitates the representation of public namespaces within the URI allocation. A URI expression may be represented as “http://example.com/abc/URI#Examples_of_URI_references”. In this expression, “http” identifies the transport scheme, “example.com” is the host part or domain name, “/abc/URI” is a path pointing to the resource or article, and “#Examples_of_URI_references” is a fragment pointing to a specific part of the resource or article.
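By way of illustration only, the decomposition of the example URI expression (with host part “example.com”) into its scheme, host, path, and fragment can be sketched with the `urlsplit` function of the Python standard library. The variable names are chosen for this sketch and are not part of any described embodiment. Note that `urlsplit` reports the fragment without the leading “#” delimiter.

```python
from urllib.parse import urlsplit

# Example URI expression from the text, with "example.com" as the host part.
uri = "http://example.com/abc/URI#Examples_of_URI_references"

parts = urlsplit(uri)

scheme = parts.scheme      # transport scheme
host = parts.netloc        # host part or domain name
path = parts.path          # path pointing to the resource
fragment = parts.fragment  # fragment, without the leading "#"

print(scheme, host, path, fragment, sep="\n")
```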
A conventional method for URI classification is pattern matching against the URI strings and other request parameters in the URI expression. Pattern matching checks the URI strings for the presence of the constituents of a given pattern. It is used to check for a desired structure, to find relevant structure, and to retrieve the matching parts. Pattern matching can be optimized in several ways, for example, by partial string matching and regular expression matching. Further, a unique number representation (hash) of URI strings can be used to categorize the URI expressions. Various algorithms are known for translating pattern matching into conditional expressions.
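A minimal sketch of the conventional approach described above follows, assuming a hypothetical set of categories and patterns. The names `PATTERNS`, `HASH_TABLE`, `classify_by_pattern`, `classify_by_hash`, and `uri_hash` are illustrative assumptions and do not come from any particular system. The first function classifies a URI string by regular expression matching; the second categorizes it by looking up a hash of the whole string.

```python
import hashlib
import re

# Hypothetical categories and patterns, for illustration only.
# Patterns are tried in order; the first one that matches wins.
PATTERNS = [
    ("image", re.compile(r"\.(png|jpg|gif)$")),
    ("article", re.compile(r"^http://example\.com/abc/")),
]


def classify_by_pattern(uri):
    """Return the first category whose regular expression matches the URI string."""
    for category, pattern in PATTERNS:
        if pattern.search(uri):
            return category
    return "unknown"


def uri_hash(uri):
    """Return a fixed-length hash (SHA-256 hex digest) of the URI string."""
    return hashlib.sha256(uri.encode("utf-8")).hexdigest()


# Hypothetical table mapping hashes of known URI strings to categories.
HASH_TABLE = {uri_hash("http://example.com/abc/URI"): "article"}


def classify_by_hash(uri):
    """Categorize a URI by exact lookup of its hash."""
    return HASH_TABLE.get(uri_hash(uri), "unknown")
```

In this sketch the hash lookup recognizes only URI strings that are identical to a stored one, whereas the regular expressions operate on the URI as a flat string, without reference to its scheme, host, path, or fragment.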
However, the pattern matching approach is inefficient for flexible URI classification. The inefficiency arises because pattern-based classification cannot take advantage of the structure of the URI syntax. Further, in the pattern matching approach, inefficient regular expressions are used to match different sub-patterns simultaneously. A sub-pattern match is a matching of patterns between different systems. Moreover, additional information may be available besides the URI itself, for example, information supplied by the user as part of the request (typically a transport header, the user's name, and the user's ID), which the pattern matching approach does not readily accommodate in URI classification.
In light of the foregoing discussion, there is a need for a flexible and efficient solution for matching arbitrarily complicated patterns in URI syntax. Further, the scope of classification has to be extended to enable consideration of additional information.