1. Field of the Invention
The invention generally relates to an approach to physically compose a specified set of services by semantic schema matching between the API schemas of source and destination services.
2. Description of the Related Art
Within this application several publications are referenced by Arabic numerals within parentheses. Full citations for these, and other, publications may be found at the end of the specification immediately preceding the claims. The disclosures of all these publications in their entireties are hereby expressly incorporated by reference into the present application for the purposes of indicating the background of the present invention and illustrating the state of the art.
With the emergence of web services, an increasing number of organizations are exposing their business competencies as collections of web services. These components range from data sources to analytical tools, applications, and business objects. With such services becoming available, it is conceivable that other users could integrate them to create new value-added services in ways that were not anticipated by their business practices, owing to the dynamic nature of the web. Thus, composing existing services to obtain new functionality will prove essential for both business-to-business and business-to-consumer applications.
Several web service development environments are currently offered in which tools are provided for manual composition of web services. Frequently this requires developers to examine the application program interfaces ("APIs"), that is, the input and output messages, of web services and determine a correspondence between the message attributes in order to chain services during composition. While manual service composition can be accomplished through such explicit programming efforts, it is not only tedious in terms of development time and effort but also does not scale as services are added or deleted. Efforts in the Semantic Web [1] have tried to address this problem by explicitly declaring preconditions and effects of web services with terms precisely defined in ontologies. The set of web services to compose is determined using goal-directed planning and rule-based inference starting from a high-level specification of a desirable goal [4, 6]. When the composition sequence for services has already been specified in the query, automatic service composition reduces to finding corresponding attributes in the input and output messages of a chain of services so that they can be physically invoked in sequence. Semantic Web approaches expect a close match between the source and destination ontological descriptions of the messages of web services to enable their chaining. In practice, since web services are derived from widely distributed sources, it is unlikely that similar terminology or abstract data structures are used across them. In such cases, semantic information is necessary to discover the correspondence between source and destination.
FIG. 1 shows an example chain in which a data source is chained with an analytics application, where the intention is to cluster the data produced by the database web service. While this is a reasonable request from an end-user, automatically composing the two services actually requires finding the correspondence between the attributes of the output message of the database web service (source) and those of the input message expected by the KMeans web service (destination), as shown. Notice that the names used to denote the attributes follow typical naming conventions used by programmers for class variables. Assigning friendly names that follow an ontology may not be possible in such cases, particularly when the automatic Java-to-WSDL converters offered in today's tools are used to produce the WSDL documents.
The schemas representing the abstract data types characterizing the input and output messages of the destination and source services are shown in FIG. 1. From this figure, we can note a number of difficulties associated with matching these schemas, namely: (1) the number of attributes in the schemas may not be the same; (2) the names of the attributes are frequently concatenations of abbreviated words, so that a direct lookup of an ontology for name similarity, as proposed in Semantic Web methods, may not be sufficient; (3) the structural information in the schemas may need to be captured to disambiguate matches; (4) type inference may be needed to detect similarity between attributes (e.g., an int-to-float association is lossless, while a float-to-int association results in loss of precision); (5) a source attribute may be split across multiple destination attributes; (6) multiple sets of attribute matches may be possible; and (7) some associations may depend on the existence of conversion functions (e.g., a 1D-to-2D array converter to convert double[ ] to double[ ][ ]).
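Three of these difficulties (concatenated attribute names, type inference, and reliance on conversion functions) can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions and is not the matching method of the invention: the attribute names `numClusters` and `custID`, the widening table `LOSSLESS`, the registry `CONVERTERS`, and the functions `split_name` and `type_compatibility` are all hypothetical. The widening table simply encodes the int-to-float example given above.

```python
import re

# Hypothetical widening table (an assumption for illustration): each source
# type maps to the destination types it may be associated with losslessly,
# following the int-to-float example in the text.
LOSSLESS = {
    "int": {"int", "float", "double"},
    "float": {"float", "double"},
    "double": {"double"},
}

# Hypothetical registry of conversion functions keyed by (source, destination).
CONVERTERS = {
    # A 1D array becomes a 2D array with a single row.
    ("double[]", "double[][]"): lambda xs: [list(xs)],
}


def split_name(name):
    """Split a programmer-style attribute name into lowercase word tokens."""
    # Acronym runs, capitalized or lowercase words, and digit runs.
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]


def type_compatibility(src, dst):
    """Classify the association of a source type with a destination type."""
    if src == dst:
        return "exact"
    if dst in LOSSLESS.get(src, ()):
        return "lossless"
    if src in LOSSLESS.get(dst, ()):
        return "lossy"  # the reverse direction is lossless, so this one narrows
    if (src, dst) in CONVERTERS:
        return "via-converter"
    return "incompatible"


print(split_name("numClusters"))
print(type_compatibility("int", "float"))
print(type_compatibility("float", "int"))
print(type_compatibility("double[]", "double[][]"))
```

The tokens produced by `split_name` would still need to be expanded and compared semantically before an ontology lookup becomes useful, and the sketch does not address structural information, split attributes, or the selection among multiple candidate match sets.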