1. Field of the Invention
The present invention relates to computer programming, and deals more particularly with techniques for using geographical taxonomy data with spatial extensions (e.g., as extensions to an object-relational database) to facilitate programmatically locating information in network-accessible service registries.
2. Description of the Related Art
Network-accessible service registries are known in the art, and may be queried to programmatically locate registered services. A registry which is currently being deployed is referred to as the Universal Description, Discovery, and Integration (or “UDDI”) registry. The term “UDDI” is also used more generally to refer to the registry and/or to the specification defining the registry and its associated access techniques. As stated on the Internet home page of the standards group defining the UDDI specification, “UDDI is the building block that will enable businesses to quickly, easily and dynamically find and transact business with one another using their preferred applications.” (See http://www.uddi.org for more information on UDDI. The UDDI specification may be found at http://www.uddi.org/specification.html.)
UDDI registries are designed for use with so-called “Web services” technology. Web services technology is a mechanism which is known in the art for distributed application integration in client/server networks such as the World Wide Web. Many industry experts consider the service-oriented Web services initiative to be the next evolutionary phase of the Internet. With Web services, distributed network access to software will become widely available for program-to-program operation, without requiring intervention from humans. Web services technology is also commonly referred to as the “service-oriented architecture” for distributed computing.
In general, a “Web service” is an interface that describes a collection of network-accessible operations. Web services fulfill a specific task or a set of tasks. They may work with one or more other Web services in an interoperable manner to carry out their part of a complex workflow or a business transaction. For example, completing a complex purchase order transaction may require automated interaction between an order placement service (i.e., order placement software) at the ordering business and an order fulfillment service at one or more of its business partners. As another example, when an on-line retailer accepts a customer's order for goods, the order completion process may include programmatically selecting a delivery service to deliver the goods to the customer, such that the customer can be provided with delivery tracking information as part of his order confirmation.
Web services are generally structured using a model in which an enterprise providing network-accessible services publishes the services to a network-accessible registry (referred to herein as a UDDI registry, for purposes of illustration only), and other enterprises needing services are able to query the registry to learn of the services' availability. The participants in this computing model are commonly referred to as (1) service providers, (2) service requesters, and (3) service brokers. These participants, and the fundamental operations involved with exchanging messages between them, are illustrated in FIG. 1. The service providers 100 are the entities having services available, and the registry to which these services are published 110 is maintained by a service broker 120. The service requesters 150 are the entities needing services and querying 140 the service broker's registry. When a desired service is found using the registry, the service requester dynamically binds 130 to the located service provider in order to use the service. The binding occurs using service information which is conveyed in a platform-neutral format.
The operations illustrated in FIG. 1 are designed to occur programmatically, without human intervention, such that a service requester can search for a particular service and make use of that service dynamically, at run-time.
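By way of illustration only, the publish, find, and bind operations of FIG. 1 may be sketched as follows. This is a minimal in-memory sketch and not an implementation of the UDDI specification; the class names, the function names, and the use of a local callable in place of a network-accessible operation are all assumptions made for purposes of the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ServiceEntry:
    """A published registry entry (hypothetical shape, for illustration)."""
    name: str
    provider: str
    # A local callable stands in for a network-accessible operation.
    endpoint: Callable[[str], str]


@dataclass
class ServiceBroker:
    """Maintains the registry to which service providers publish."""
    registry: Dict[str, List[ServiceEntry]] = field(default_factory=dict)

    def publish(self, category: str, entry: ServiceEntry) -> None:
        # Publish operation: a provider registers a service under a category.
        self.registry.setdefault(category, []).append(entry)

    def find(self, category: str) -> List[ServiceEntry]:
        # Find operation: a requester queries the registry for candidates.
        return list(self.registry.get(category, []))


def bind_and_invoke(entry: ServiceEntry, request: str) -> str:
    # Bind operation: the requester uses the located service at run-time.
    return entry.endpoint(request)


broker = ServiceBroker()
broker.publish(
    "package delivery",
    ServiceEntry("express", "Provider A", lambda order: f"scheduled:{order}"),
)

candidates = broker.find("package delivery")
result = bind_and_invoke(candidates[0], "order-42")
```

In this sketch the service requester does not know of “Provider A” in advance; it learns of the service only by querying the broker, and then invokes the located service without human intervention.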
When searching a network-accessible registry for a network-accessible service, it may happen that more than one candidate service is located. For example, if an on-line retailer is searching for a delivery service with which to programmatically schedule delivery of a customer's order, a number of delivery services may be located, including the postal service and package delivery services offered by private companies. It is then necessary to select from among the candidates. Any number of criteria, which may vary widely, may be used in making this selection. For example, when searching for a fee-based service, the cost of the service may be used to rank the candidates. As another example, the reputation and/or name recognition of the service provider may be an important factor in the selection process. As yet another example, the geographic location of the service provider, or the geographic boundaries in which the service is available, may be important.
Service providers are allowed to “tag” entries they publish in a registry with information that allows the published entry to be categorized. The categorization may follow any number of different taxonomies or classification schemes. The tags on the entries can then be used when searching the registry according to values in a taxonomy. (The terms “classification scheme” and “taxonomy” are used interchangeably herein.)
Taxonomies are typically structured as multi-level hierarchies in which successively-deeper levels of the hierarchy provide more granularity or refinement of information. For example, suppose a product-based taxonomy defines the value “123” as representing computers, “1231” as representing computer software, and “1232” as representing computer hardware. The computer software category might be further refined as having a value “12311” for operating system software and “12312” for application software. The “12311” value for operating system software might be further refined (using 6-digit values) to distinguish different operating systems. To locate all operating system software, a search can be carried out using the 5-digit value “12311”; if the searcher is only interested in software for particular operating systems, then the corresponding 6-digit value can be used instead.
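The search described above may be sketched as a prefix match, assuming (as in the example) that each deeper level of the hierarchy appends a digit to the value of its parent. The entry names and the 6-digit values shown are hypothetical, and real taxonomies need not encode their hierarchy in this way.

```python
from typing import Dict, List

# Registry entries tagged with values from the example product taxonomy.
# The entry names and the 6-digit values are hypothetical.
TAGGED_ENTRIES: Dict[str, str] = {
    "entry-a": "123111",  # a particular operating system (hypothetical value)
    "entry-b": "123112",  # another operating system (hypothetical value)
    "entry-c": "12312",   # application software
    "entry-d": "1232",    # computer hardware
}


def search_by_taxonomy_value(entries: Dict[str, str], value: str) -> List[str]:
    """Return the entries tagged at or below the given taxonomy value."""
    return sorted(
        name for name, code in entries.items() if code.startswith(value)
    )


all_operating_systems = search_by_taxonomy_value(TAGGED_ENTRIES, "12311")
one_operating_system = search_by_taxonomy_value(TAGGED_ENTRIES, "123111")
all_software = search_by_taxonomy_value(TAGGED_ENTRIES, "1231")
```

A shorter value thus matches a broader portion of the hierarchy, while a longer value narrows the search to a more specific category.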
One popular classification scheme is the United Nations Standard Products and Services Code, or “UNSPSC”™. UNSPSC defines numeric identifiers for goods and services, where values for different levels of the hierarchy are separated using “.” notation. Another popular classification scheme is a “Dun and Bradstreet number”, or “D-U-N-S”® number. D-U-N-S numbers are nine-digit identifiers of business entities, and are structured to hierarchically link together the entities within a larger corporate structure. The North American Industry Classification System, or “NAICS”, is another classification scheme. It defines 6-digit codes for business sectors in Canada, the United States, and Mexico. A 4-digit industry code is defined by the Standard Industrial Classification, or “SIC”, system. International Standard 3166 from the International Organization for Standardization (“ISO”), which is titled “Codes for the representation of names of countries and their subdivisions”, defines a geographic taxonomy in which countries of the world are defined using alphabetic abbreviations. Subdivisions or regions within countries are also defined, in some cases, for further refinement. Thus the state of Florida is defined as “US-FL”, identifying that this state (“FL”) is a region of the United States (“US”). Other geographic classification schemes include the GeoWeb Geographic Classification (“GGC”) system, which uses 6-digit values identifying a city, state/region, country, and continent.
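As an illustration of how a hierarchical geographic value such as “US-FL” may be interpreted, the following sketch separates the country portion from the subdivision portion and tests whether one value falls within another. The function and type names are hypothetical, and the sketch does not validate values against the actual code lists published by ISO.

```python
from typing import NamedTuple, Optional


class GeoCode(NamedTuple):
    country: str
    subdivision: Optional[str]  # None when the value names a whole country


def parse_geo_code(code: str) -> GeoCode:
    """Split a value such as "US-FL" into country and subdivision parts."""
    country, separator, subdivision = code.partition("-")
    if not country or (separator and not subdivision):
        raise ValueError(f"malformed geographic code: {code!r}")
    return GeoCode(country, subdivision if separator else None)


def is_within(code: str, region: str) -> bool:
    """Report whether `code` identifies a place inside `region`."""
    inner = parse_geo_code(code)
    outer = parse_geo_code(region)
    if inner.country != outer.country:
        return False
    # A whole-country region contains all of that country's subdivisions.
    return outer.subdivision is None or outer.subdivision == inner.subdivision


florida = parse_geo_code("US-FL")
```

Under these assumptions, a search for services in “US” would also match an entry tagged “US-FL”, whereas a search for “US-FL” would not match an entry tagged only with “US”.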
UDDI registry entries may include any of these described taxonomies. While existing implementations of UDDI registries support several of these taxonomies, the taxonomy support in UDDI is extensible, and thus the taxonomies that can be referenced within a UDDI registry are not limited to the examples described above.
To maximize exposure of a company's goods and services, the company will likely provide multiple tags when categorizing its entries in a registry. Thus, for example, a package delivery service might provide a tag identifying the D-U-N-S number of the corporate entity and a numeric identifier corresponding to a “package delivery” service category in one of the service-related taxonomies. In fact, it is likely that more than one identifier will be provided to identify this particular service, using the identifier from each taxonomy in which a package delivery service category is defined. The registry entry for the package delivery service will also likely be tagged with “cross-category” tags (i.e., identifiers of one or more categories which are related to package delivery), to further increase exposure of the company's service. For example, the registry entry may be tagged with an identifier for “shipping” or for “business services”, possibly with multiple identifiers from multiple taxonomies for each of these related categories.
The volume of data within network-accessible registries is expected to be very large. With multiple tags on the service entries, including the cross-category tags and the duplicated tag values to provide categorization in multiple taxonomies, the search process will be complex. In some cases, available services may be overlooked during a search because a service has been categorized using a value or values in one taxonomy while the search specified a value in another taxonomy. Accordingly, what is needed are techniques for improving categorization in network-accessible registries, thereby enabling improved searching of these registries.
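The problem of overlooked services may be illustrated with the following sketch, in which a search specifying a value in one taxonomy fails to locate a service that was categorized only in another. The taxonomy names, values, and entry names are hypothetical placeholders and do not correspond to any actual classification scheme.

```python
from typing import Dict, List, Set, Tuple

Tag = Tuple[str, str]  # (taxonomy name, value within that taxonomy)

# Three package delivery services, each tagged by its provider.
# "taxonomy-A" / "A-100" and "taxonomy-B" / "B-200" are hypothetical
# identifiers that both denote a "package delivery" category.
REGISTRY: Dict[str, Set[Tag]] = {
    "carrier-1": {("taxonomy-A", "A-100")},
    "carrier-2": {("taxonomy-B", "B-200")},
    "carrier-3": {("taxonomy-A", "A-100"), ("taxonomy-B", "B-200")},
}


def find_by_tag(registry: Dict[str, Set[Tag]], taxonomy: str, value: str) -> List[str]:
    """Return entries carrying exactly the given taxonomy/value tag."""
    return sorted(
        name for name, tags in registry.items() if (taxonomy, value) in tags
    )


found = find_by_tag(REGISTRY, "taxonomy-A", "A-100")
overlooked = sorted(set(REGISTRY) - set(found))
```

Here “carrier-2” offers the desired service but is not returned, because its provider tagged it only with a value from the second taxonomy.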