1. Field of the Invention
The present invention generally relates to a semantic interface for publishing a web service to and discovering a web service from a web service registry, and more particularly to an interface that facilitates publication of a web service by providing contextual information, without manual entry of metadata, and that facilitates discovery of a web service by allowing contextual searching.
2. Related Art
The World Wide Web provides access to a vast amount of information in the form of hypertext markup language (HTML) documents and also provides access to web-based applications, which are software programs, or groups of programs, that provide a particular functionality to the end user, such as database and Internet searching, electronic commerce, banking, travel planning, etc. Such applications are sometimes implemented using “web services”, which are a standardized way of integrating web-based application modules.
Unlike traditional client/server models, such as a web server/web page system, web services do not provide a user-interface. Instead, web services share business logic, data, and processes through a programmatic interface across a network, i.e., these web applications interface with one another, rather than the user. Web services are modular, in the sense that they are designed to be added by web application developers to a graphical user-interface (GUI), such as a web page or an application program, to provide the specific functionality of the web service to the user.
Web services allow different applications from different sources to communicate with each other without time-consuming custom programming. Also, because communication between web services is in a common language, Extensible Markup Language (XML), they are not tied to one particular operating system or programming language. For example, applications written in Java can communicate with applications written in Perl, and Windows applications can communicate with UNIX applications.
XML documents may be used in conjunction with XML schemas, which describe and constrain the content of the XML documents. XML schemas are written in an XML-based language, such as the W3C XML Schema Definition Language (the standard defining that language, as adopted by the W3C recommendation of May 2, 2001, is hereby incorporated herein by reference). Schemas create an XML vocabulary for expressing business rules for data contained in XML documents and allow for the “validation” of XML documents, i.e., allow determination of whether the data in the XML document meets the constraints defined in the schema. Such vocabularies may be created such that element and attribute names are associated with a particular “namespace”, which is a collection of names that are identified by a uniform resource identifier (URI) reference (the generic term for all types of names and addresses that refer to objects on the World Wide Web). The use of namespaces allows XML documents to use elements and attributes that have the same name, but come from different sources.
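By way of illustration only, the following sketch shows how namespaces keep two elements with the same local name distinct. The namespace URIs, element names, and values are hypothetical and are chosen solely for this example; the sketch does not perform schema validation.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, used only for this illustration.
CUSTOMER_NS = "http://example.com/ns/customer"
PRODUCT_NS = "http://example.com/ns/product"

DOC = (
    '<order xmlns:cust="http://example.com/ns/customer" '
    'xmlns:prod="http://example.com/ns/product">'
    "<cust:name>Acme Corp.</cust:name>"
    "<prod:name>Widget</prod:name>"
    "</order>"
)

root = ET.fromstring(DOC)

# ElementTree expands each prefixed name to "{namespace-URI}local-name",
# so the two "name" elements from different sources do not collide.
customer_name = root.find(f"{{{CUSTOMER_NS}}}name").text
product_name = root.find(f"{{{PRODUCT_NS}}}name").text

print(customer_name, "|", product_name)
```

Both elements share the local name "name", yet each is retrieved unambiguously because it is qualified by its own namespace URI.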
Web services employ simple object access protocol (SOAP), web services description language (WSDL), and universal description, discovery and integration (UDDI), which are open standards that are designed to be implemented over an Internet protocol backbone. Generally speaking, SOAP is used to encapsulate and formalize the data transferred between web services, WSDL is used to describe the web services interface and technical details, and UDDI acts as a directory of the web services that are available. Used primarily as a means for businesses to communicate with each other and with clients, web services allow organizations to communicate data, without intimate knowledge of each other's computer systems.
SOAP is a lightweight XML-based messaging protocol used to encode the information in web service request and response messages before sending them over a network. SOAP messages are independent of any operating system or protocol and may be transported using a variety of Internet protocols, including simple mail transfer protocol (SMTP), multipurpose Internet mail extensions (MIME), and hypertext transfer protocol (HTTP).
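For purposes of illustration, a SOAP request may be sketched as an XML envelope that wraps an operation and its parameters. The sketch below assumes the SOAP 1.1 envelope namespace; the service namespace, the operation name "GetBalanceSummary", and the parameter "accountId" are hypothetical. Transport over HTTP or SMTP is omitted.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SVC_NS = "http://example.com/ns/cardservice"            # hypothetical service namespace


def build_request(operation, params):
    """Encode an operation call as a SOAP envelope and return it as text."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SVC_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{SVC_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_request(xml_text):
    """Decode the envelope back into (operation, params)."""
    envelope = ET.fromstring(xml_text)
    body = envelope.find(f"{{{SOAP_ENV}}}Body")
    call = body[0]  # first child of Body is the operation element
    operation = call.tag.split("}", 1)[1]
    params = {child.tag.split("}", 1)[1]: child.text for child in call}
    return operation, params


message = build_request("GetBalanceSummary", {"accountId": "12345"})
print(message)
print(parse_request(message))
```

Because the message is plain XML text, the same envelope could be produced or consumed by software written in any language on any operating system.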
WSDL is an XML-formatted language used to describe the web service as a collection of communication endpoints capable of exchanging messages, and the interface and technical details for doing so. As the service interface specification, WSDL is an important part of the UDDI registry model, which is an XML-based worldwide business registry of web services. UDDI is implemented as a web-based, distributed directory that enables businesses to list themselves and their available web services on the Internet and “discover” each other, in a manner analogous to a traditional phone book's yellow and white pages.
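As a non-limiting sketch, the operations and communication endpoints described by a WSDL document may be read programmatically. The fragment below uses the WSDL 1.1 and WSDL SOAP binding namespaces; the service, port, operation names, and endpoint address are hypothetical, and the fragment is deliberately abbreviated (message and type definitions are omitted), so it is not a complete WSDL description.

```python
import xml.etree.ElementTree as ET

NS = {
    "wsdl": "http://schemas.xmlsoap.org/wsdl/",       # WSDL 1.1 namespace
    "soap": "http://schemas.xmlsoap.org/wsdl/soap/",  # WSDL SOAP binding namespace
}

# Abbreviated, hypothetical WSDL fragment for illustration only.
WSDL_DOC = """<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:soap="http://schemas.xmlsoap.org/wsdl/soap/"
    xmlns:tns="http://example.com/ns/cardservice"
    name="CardService" targetNamespace="http://example.com/ns/cardservice">
  <portType name="CardPortType">
    <operation name="GetBalanceSummary"/>
  </portType>
  <binding name="CardBinding" type="tns:CardPortType"/>
  <service name="CardService">
    <port name="CardPort" binding="tns:CardBinding">
      <soap:address location="http://example.com/services/card"/>
    </port>
  </service>
</definitions>"""

root = ET.fromstring(WSDL_DOC)

# The interface: which operations the service offers.
operations = [
    op.get("name") for op in root.findall("wsdl:portType/wsdl:operation", NS)
]

# The endpoints: where each port of the service can be reached.
endpoints = {
    port.get("name"): port.find("soap:address", NS).get("location")
    for port in root.findall("wsdl:service/wsdl:port", NS)
}

print(operations)
print(endpoints)
```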
Referring to FIG. 1, web services may be registered, or “published”, in a UDDI web service registry 110, by the web service provider 120. UDDI is a specification for one or more web-based registries that provide information about a business or other entity and its technical interfaces, that is, its application program interfaces (APIs), which are the set of routines, protocols, and tools for building software applications. By accessing a public UDDI operator site, a web service requestor 130 or other user can search for information about web services that are made available by or on behalf of a business. This information allows others to discover what technical programming interfaces are provided for interacting with a business for purposes such as electronic commerce, etc. The web service requestor can then bind the desired web service to a web application using the information acquired from the web service registry and from the WSDL.
The information that a business or other entity can register using UDDI includes several kinds of simple data that help others determine which web services the business offers, the functions that those services perform, and how to access the services. For example, this information typically includes business name, business identifiers (e.g., D&B D-U-N-S Number®, etc.), and other contact information. The information also includes classification information, such as industry codes and product classifications, as well as descriptive information about the services that the business makes available. The registered information also includes the location of web services by providing a uniform resource identifier (URI), uniform resource locator (URL), email address, etc., through which each web service is accessed.
In the past, to build compatible software, two companies only had to agree to use the same specification, and then test their software. By contrast, web services are designed to be shared on an ad hoc basis. Therefore, web service programmers need a way to distinctly identify public specifications (or, alternatively, private specifications shared only with select partners), so that these specifications are discoverable by web service users. This identifying information relating to the specifications, or metadata, is provided by the tModel within UDDI. Thus, the tModel mechanism facilitates discovering information about services and other technical foundation concepts for web services that are intended by a business or other entity to be exposed for broad use.
For example, a business may buy a software package that allows automatic acceptance of electronic orders via the Internet. The business could “advertise” the availability of this electronic commerce capability using one of the public UDDI operator sites, so that potential business partners and customers could find out that they can accept orders electronically. In addition, the software may be configured to automatically consult one of the public UDDI sites and identify compatible business partners, by looking up businesses identified by the user and determining which ones have already advertised support for the electronic commerce services provided by the software package.
The software package may accomplish this by taking advantage of the fact that a tModel has been registered within UDDI and a corresponding tModel key (“tModelKey”) is assigned at the time of registration. This tModel represents the technical details for the electronic commerce capability. Individual partner capabilities are stored within UDDI as information about “service bindings”; each of these bindings references the tModel that represents or references the specific interface that the software package understands.
The tModel keys within a service binding description may be thought of as a fingerprint that can be used to trace the compatibility origins of a given web service. Since many such services will be constructed or pre-programmed to be compatible with a given, well-known interface, references to the tModel serve to identify the properties associated with a given service. For software companies and programmers, tModels provide a common point of reference that allows technical details of the service to be registered, and compatible implementations of those services to be easily identified. For businesses, the tModel greatly reduces the work in determining which particular bindings exposed by a business partner are compatible with the software used within the business. Finally, for standards organizations, the ability to register information about a specification and then find implementations of web services that are compatible with a standard helps customers immediately realize the benefits of a widely used design.
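The role of the tModel key as a compatibility fingerprint may be sketched with a simplified in-memory model. The class and method names below (ToyRegistry, register_tmodel, publish_binding, find_compatible) are hypothetical and do not correspond to the actual UDDI API; the business names and URLs are likewise invented for illustration.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Binding:
    business: str
    access_point: str
    tmodel_keys: set


@dataclass
class ToyRegistry:
    """Hypothetical stand-in for a registry; not the UDDI API."""
    tmodels: dict = field(default_factory=dict)
    bindings: list = field(default_factory=list)

    def register_tmodel(self, name, overview_url):
        # A key is assigned at the time the tModel is registered.
        key = "uuid:" + str(uuid.uuid4())
        self.tmodels[key] = {"name": name, "overview_url": overview_url}
        return key

    def publish_binding(self, business, access_point, tmodel_keys):
        unknown = set(tmodel_keys).difference(self.tmodels)
        if unknown:
            raise KeyError(f"unregistered tModel keys: {sorted(unknown)}")
        self.bindings.append(Binding(business, access_point, set(tmodel_keys)))

    def find_compatible(self, tmodel_key):
        # Bindings that reference the same tModel implement the same interface.
        return [b.business for b in self.bindings if tmodel_key in b.tmodel_keys]


registry = ToyRegistry()
orders_key = registry.register_tmodel(
    "ElectronicOrders", "http://example.com/specs/orders.wsdl")
invoices_key = registry.register_tmodel(
    "ElectronicInvoices", "http://example.com/specs/invoices.wsdl")

registry.publish_binding("Supplier A", "http://a.example.com/orders", [orders_key])
registry.publish_binding("Supplier B", "http://b.example.com/invoices", [invoices_key])
registry.publish_binding("Supplier C", "http://c.example.com/api",
                         [orders_key, invoices_key])

print(registry.find_compatible(orders_key))
```

In this sketch, a software package that understands the "ElectronicOrders" interface needs only the corresponding key to identify Supplier A and Supplier C as compatible partners.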
As noted above, UDDI registry entries may specify one or more classifications, or category codes, for the business entity publishing the web service, such as, for example, NAICS, UNSPSC, and SIC codes, etc. Other classification systems designate geographic information or membership in a given organization. This classification information, in turn, allows simple searching, such as Boolean key-word searching, to be done on the information contained in the public registries.
Among the shortcomings of such conventional approaches to web registry publication and discovery is that the web provider must manually enter the publication information (e.g., the classification information) and such information does not provide a contextual basis for semantic-based searching. For example, a keyword search may be performed for a web service to compute a “balance summary” for a financial transaction instrument such as a corporate card. However, because the search terms “balance summary” are applicable in many different business areas, the search may return too many results to be useful, even when the search is limited to specific business classifications.
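The limitation of keyword searching described above may be sketched as follows. The service entries, descriptions, and category codes ("FIN", "LOG") are hypothetical placeholders rather than actual registry data or real classification codes.

```python
# Hypothetical registry entries; category codes are placeholders only.
SERVICES = [
    {"name": "CardBalanceSummary", "category": "FIN",
     "description": "Balance summary for a corporate card account"},
    {"name": "LoanBalanceSummary", "category": "FIN",
     "description": "Balance summary for a mortgage loan"},
    {"name": "InventoryBalance", "category": "LOG",
     "description": "Stock balance summary for a warehouse"},
]


def keyword_search(services, terms, category=None):
    """Boolean AND search over descriptions, optionally limited to a category."""
    wanted = [t.lower() for t in terms]
    return [
        s["name"]
        for s in services
        if all(t in s["description"].lower() for t in wanted)
        and (category is None or s["category"] == category)
    ]


all_hits = keyword_search(SERVICES, ["balance", "summary"])
financial_hits = keyword_search(SERVICES, ["balance", "summary"], category="FIN")

print(all_hits)
print(financial_hits)
```

Even after the search is restricted to a single classification, more than one result remains, because the keywords alone carry no information about the context (here, corporate card versus mortgage loan) in which the service is used.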
To supplement the system of classification by predefined categories, such as industry codes, product codes, geography codes and business identification codes (such as D&B D-U-N-S Numbers®), etc., UDDI allows other search services to use this core classification information as a starting point to provide contextual indexing and classification. This capability allows a business to extend the support that UDDI operators provide for managing validated taxonomies, by allowing third parties to create and check taxonomies.
“Unchecked” taxonomies also may be used for categorization and identification without the need for UDDI to perform a specific call-out to a validation service. Organizations that choose to make a particular taxonomy available for categorization or identification can register it as an unchecked taxonomy. Unchecked taxonomies are registered by simply registering a new tModel and classifying that tModel as either an identifier or a categorization taxonomy. On the other hand, “checked” taxonomies may be created from these new taxonomies, if the publisher of a new taxonomy wishes to make sure that the categorization code values or identifiers being registered represent accurate and validated information.
UDDI also supports third parties who wish to maintain augmented UDDI registries, which include additional descriptive information that does not fit into the standard UDDI framework. For example, Infravio (http://www.infravio.com) provides a UDDI-based registry-repository platform for service oriented architectures (SOA), which are interrelated collections of web services.
However, a significant shortcoming of these approaches is that they require manual entry and updating of the tModel or supplemental registry information after the web service has been created. Often, such information changes over time as a web service is modified or used in new contexts, and the tModel or augmented UDDI registries must then be manually updated to reflect such changes.
Another approach to providing enhanced discovery capabilities for those seeking web services is to provide a semantics-based searching capability, which seeks to specify the meaning of the resources described on the Web. For example, the “Semantic Web” is a collaborative effort led by W3C that seeks to provide a common framework that allows data to be shared and reused across application, enterprise, and community boundaries. It is largely based on the resource description framework (RDF), which integrates a variety of applications using XML for syntax and URIs for naming.
RDF is a language for representing information about resources in the World Wide Web. It is particularly intended for representing metadata about web resources, such as the title, author, and modification date of a web page, copyright and licensing information about a web document, or the availability schedule for some shared resource. However, by generalizing the concept of a “web resource”, RDF can also be used to represent information about things that can be identified on the Web, even when they cannot be directly retrieved on the Web. Examples include information about items available from on-line shopping facilities (e.g., information about specifications, prices, and availability), or the description of a Web user's preferences for information delivery.
RDF is intended for situations in which this information needs to be processed by applications, rather than being only displayed to people. RDF provides a common framework for expressing this information so it can be exchanged between applications without loss of meaning. Since it is a common framework, application designers can leverage the availability of common RDF parsers and processing tools. The ability to exchange information between different applications means that the information may be made available to applications other than those for which it was originally created.
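By way of example only, RDF statements may be thought of as subject-predicate-object triples that an application can query. The sketch below stores triples as plain tuples and uses the Dublin Core element namespace for the predicates; the page URLs, title, author, and date are hypothetical, and the match function is a simplified stand-in for an RDF processing tool.

```python
DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core element namespace
PAGE = "http://example.com/index.html"   # hypothetical web resource

# Each statement is a (subject, predicate, object) triple.
TRIPLES = [
    (PAGE, DC + "title", "Example Home Page"),
    (PAGE, DC + "creator", "J. Smith"),
    (PAGE, DC + "date", "2004-08-16"),
    ("http://example.com/about.html", DC + "creator", "J. Smith"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every non-None component."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


titles = [o for _, _, o in match(TRIPLES, subject=PAGE, predicate=DC + "title")]
pages_by_author = [
    s for s, _, _ in match(TRIPLES, predicate=DC + "creator", obj="J. Smith")
]

print(titles)
print(pages_by_author)
```

Because the predicates are identified by URIs from a shared vocabulary, an application other than the one that created the statements could interpret them in the same way.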
The web ontology language (OWL) is a language for defining and instantiating web ontologies. “Ontology” refers to the science of describing the kinds of entities in the world and how they are related. An OWL ontology may include descriptions of classes, properties and their instances. Given such an ontology, the OWL formal semantics specifies how to derive its logical consequences, that is, facts not literally present in the ontology, but entailed by the semantics. These entailments may be based on a single document or multiple distributed documents that have been combined using defined OWL mechanisms. While there are logical synergies between the semantics of RDF and OWL, OWL extends the notion to relationships of things and the relationships of the data that describe them. Although a powerful tool, RDF tends to be focused on the descriptive framework and namespace for an artifact. One approach is to use RDF as the framework in which to declare one's metadata for an artifact, potentially taking advantage of standards from several sources such as Dublin Core Metadata Initiative (DCMI) and OWL.
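The derivation of facts that are entailed by, but not literally present in, an ontology may be sketched with a single rule: an individual that belongs to a class also belongs to every superclass of that class. The class names and the individual "card42" are hypothetical, and the sketch covers only subclass entailment, not the full OWL semantics.

```python
# Hypothetical ontology: each class maps to its directly asserted superclasses.
SUBCLASS_OF = {
    "CorporateCard": ["ChargeCard"],
    "ChargeCard": ["FinancialInstrument"],
}

# Asserted facts: each individual maps to its directly asserted classes.
ASSERTED_TYPES = {"card42": ["CorporateCard"]}


def superclasses(cls, subclass_of):
    """Return every class reachable from cls by following subclass links."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def entailed_types(individual, asserted_types, subclass_of):
    """Asserted classes plus all classes entailed by the subclass hierarchy."""
    types = set(asserted_types.get(individual, ()))
    for cls in list(types):
        types |= superclasses(cls, subclass_of)
    return types


print(sorted(entailed_types("card42", ASSERTED_TYPES, SUBCLASS_OF)))
```

Here the statement that "card42" is a "FinancialInstrument" appears nowhere in the asserted data, yet it follows from the class hierarchy.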
An ontology differs from an XML schema in that it is a knowledge representation, whereas an XML schema is a set of declarations and constraints that describe an XML message format. Most industry-based web standards consist of a combination of message formats and protocol specifications. These formats have been given an operational semantics, such as, for example, “upon receipt of this PurchaseOrder message, transfer Amount dollars from AccountFrom to AccountTo and ship Product.” But this specification is not designed to support reasoning outside the transaction context. Thus, a general advantage of the ontological approach is the availability of tools that provide generic support that is not specific to the particular subject domain, in contrast to a system based on a specific industry-standard XML schema.
However, semantic-based approaches still rely on a manually-entered set of keywords or classification information for each web service, which must be manually updated to reflect any changes in the functionality of the web service. The generic ontologies created under such semantic-based approaches may not be well-suited for a particular industry. Moreover, the creation of these ontological associations is generally a static process outside the control of the publisher of the web service.
Given the foregoing, what is needed is a system, method and computer program product for describing a semantic interface for publishing a web service to and discovering a web service from a web service registry. More specifically, what is needed is a system for augmenting the basic UDDI registry to include contextual information in a manner that is controllable and updateable by the web service publisher, without requiring manual entry and monitoring of the registry information.