The present invention relates to the field of Internet interactivity and, more particularly, to a system for accessing and invoking automation objects over the Internet.
In the early days of desktop computing, all applications were monolithic, i.e., they were self-contained, standalone programs. As good as these programs were, a problem still existed with these monolithic applications. Development of traditional software applications required the application executables to be compiled and linked with their dependencies. Thus, every time developers wanted to update the processing logic or implement new capabilities, they would have to modify and recompile the entire primary application in order to do so. In essence, in order to make any changes to any portion of the program, the entire application had to be rewritten. This made it impractical to upgrade the application as minor improvements were made.
This problem was addressed by the introduction of a component software paradigm. A basic principle of component software is that applications can be built from a series of prebuilt and easily developed, understood, and changed software modules called components, each providing a particular function. Thus, applications could be delivered, enhanced, or extended much more quickly and at a lower cost simply by updating or adding new components.
Unfortunately, the component software paradigm suffers a problem similar to that of the monolithic application. Each time the components are enhanced and upgraded, as with applications, the components must be recompiled by the component developers. Either the application developer or the end user would have to monitor for and obtain updated components. The distributed component paradigm has provided a solution to this problem.
Distributed components exist at specific locations. Developers of applications or other components that require a distributed component need only find the component and then use it. The developer does not need to compile or recompile the component. This is done by the creators of the component. Thus, the latest and greatest version of each component is always available to developers and other users.
The widespread use of the Internet, an open environment, presents many new opportunities for distributed component software, and some associated shortcomings as well. The availability of a vast number of vendors, each creating a number of components, increases the ease with which applications may be built and increases the flexibility to tailor an application to suit a user's needs. Unfortunately, the open environment of the Internet means that no one can implicitly "trust" everyone else, as one can in a traditional client-server system. Thus, all but some dedicated server machines are hidden behind firewalls to protect against unwanted intrusions. Firewalls are barriers that filter packets based on certain criteria, such as a type of packet, and/or based on an Internet address. Firewalls shield servers by controlling traffic between the Internet and the server and controlling which packets may pass through them.
Since only certain types of packets may pass through to the server when firewalls are in place, the ability to access remote components over the Internet is severely limited.
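The filtering behavior described above can be illustrated with a minimal sketch. It is not part of the invention; the rule set, the protocol labels ("http", "dcom"), the addresses, and the names Packet and passes_firewall are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Packet:
    source: str    # sender's Internet address
    protocol: str  # type of packet; the labels used here are illustrative


# Hypothetical rule set: only HTTP packets are admitted, and one
# address range (a documentation-only range) is blocked outright.
ALLOWED_PROTOCOLS = {"http"}
BLOCKED_NETWORKS = [ip_network("203.0.113.0/24")]


def passes_firewall(packet: Packet) -> bool:
    """Return True if the packet may pass through to the server."""
    # Criterion 1: filter on the type of packet.
    if packet.protocol not in ALLOWED_PROTOCOLS:
        return False
    # Criterion 2: filter on the Internet address of the sender.
    address = ip_address(packet.source)
    return not any(address in network for network in BLOCKED_NETWORKS)
```

Under these assumed rules, a remote object call carried in a non-HTTP packet is dropped, while the same call carried over HTTP from a permitted address would pass.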
A second problem with today's Internet is not so much a problem as a shortcoming. Much of the Internet's use is conducted through the World Wide Web, hereinafter referred to as "WWW," or simply the "Web," in which linked pages of static content, composed of a variety of media, such as text, images, audio, and video, are described using Hypertext Markup Language (HTML). While the WWW revolution opened the doors to a wealth of information at the fingertips of ordinary people, and while HTML is a very good way of describing static documents, it provides no means to interact with the Web pages. In this static model, a Web browser uses the Hypertext Transfer Protocol (HTTP) to request an HTML file from a Web server. HTTP is an Internet protocol designed for rapid and efficient delivery of HTML documents. HTTP is a stateless protocol, meaning that each request to the Web server is treated independently, with the server retaining no "memory" of any previous connections. The Web server receives the request and sends the HTML page to the Web browser, which formats and displays the page. Although this model provides a client with ready access to nicely formatted pages of information, it provides only limited interaction between the client and the Web server. Furthermore, HTML pages must be manually edited in order to change what the Web server sends to a client, such as a Web browser. Thus, much of the potential richness of the World Wide Web is not fully realized.
One of the biggest challenges to any Web site is to offer dynamic content, i.e., content that changes in real time. This requires applications to be run from the Web servers. Changing from a static Web content model to a dynamic Web content model would allow WWW content providers to provide interactive business applications rather than merely publishing pages of static information. For example, a travel agency could enable customers to check available flights, compare fares, and reserve seats on flights, rather than merely looking at flight schedules.
HTTP is not well-suited for implementing dynamic Web pages because interacting with Web pages potentially involves a large number of requests. In a typical scenario, a client, such as a web browser, is used to initiate a query, which is sent to an HTTP server operating on a host computer somewhere on the Internet. The query might represent a request for documents containing certain data, or may represent the address, or Uniform Resource Locator (URL), of a particular Web page. The server locates the documents and sends their contents back to the client. In loading the documents for viewing, the client often encounters additional files such as embedded images or sounds, that need to be loaded. The client continues making requests to the server until all of the additional files are received and loaded.
Since HTTP is a stateless protocol, as mentioned above, existing HTTP servers create a separate process for each request received. The greater the number of concurrent requests, the greater the number of concurrent processes created by the server. Unfortunately, creating a process for every request is time-consuming and requires large amounts of server resources such as memory and processor cycles. In addition, creating a process for every request can restrict the server resources available for sharing, slowing down performance, and increasing wait times.
In summary, since most servers are protected by firewalls, only certain types of packets, such as HTTP packets, may pass through to the server, and since HTTP is not suited for interactivity, the goal of providing dynamic content over the Internet is severely limited. Thus, in order to fully realize the potential of distributed component software and of dynamic content on the World Wide Web, there exists a need for software having the ability to access and invoke Automation objects through firewalls.
In accordance with the present invention, a method and software program are provided that give end users and developers all the advantages of distributed component software and capitalize on the resources available on a computer network, such as the Internet, to provide richer, more interactive content. The invention achieves this result by defining a protocol capable of accessing and invoking methods in Automation objects across the Internet and through firewalls. The protocol, called a Simple Object Access Protocol (SOAP), is an application layer protocol that is layered on top of HTTP and allows Microsoft Component Object Model (COM) Automation objects to be accessed and methods to be invoked over the Internet through Web servers protected by firewalls. "Application layer" refers to the highest layer in the seven-layer Reference Model for Open Systems Interconnection (OSI Reference Model), an international standard for networking by the International Organization for Standardization (ISO). The application layer is concerned with the semantics of the information exchanged; it ensures that two application processes performing an information processing task on either side of a network understand each other. The OSI Reference Model, as described in "Open Systems Interconnection (OSI) - New International Standards Architecture and Protocols for Distributed Information Systems," special issue, Proc. IEEE, vol. 71, no. 12, Dec. 1983, is hereby incorporated by reference.
The inventive protocol includes a data structure which encodes, as a SOAP request, the name of the Automation object of interest, a method to invoke in that object, and any valid Automation [in][out] parameters to be exchanged with the object, and creates a client-side SOAP proxy for the Automation object. The range of valid parameter types is defined by the COM Automation Variant type. In addition to Variant data types, the protocol also supports passing ActiveX Data Objects (ADO) Recordset objects. Variant and Automation "object" classes such as the ADO Recordset may be used as [in], [out], or [in, out] parameters. The SOAP proxy packages the SOAP request into a multipart MIME type.
MIME, which stands for Multipurpose Internet Mail Extensions, is an extension to the traditional Internet Mail protocol to allow for multimedia electronic mail. MIME was developed to accommodate electronic mail messages containing many parts of various types such as text, images, video, and audio. MIME is defined in Document RFC 1521 of the Network Working Group, September 1993, which is hereby incorporated by reference.
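The packaging of a SOAP request into a multipart MIME message can be sketched as follows. This is only an illustration under stated assumptions: the text above does not fix a wire layout, so the use of one MIME part for the object and method names, one part per [in] parameter, JSON for the part bodies, and the names SoapRequest, package_request, and unpack_request are all hypothetical.

```python
import json
from dataclasses import dataclass, field
from email import message_from_string
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


@dataclass
class SoapRequest:
    object_name: str   # name of the Automation object of interest
    method_name: str   # method to invoke in that object
    in_params: dict = field(default_factory=dict)  # [in] parameters


def package_request(request: SoapRequest) -> str:
    """Package the request as a multipart MIME message (assumed layout)."""
    envelope = MIMEMultipart("mixed")
    # First part: which object and which method (JSON body is an assumption).
    call = {"object": request.object_name, "method": request.method_name}
    envelope.attach(MIMEText(json.dumps(call), "plain"))
    # Remaining parts: one per parameter, labeled with the parameter name.
    for name, value in request.in_params.items():
        part = MIMEText(json.dumps(value), "plain")
        part.add_header("Content-Disposition", "inline", name=name)
        envelope.attach(part)
    return envelope.as_string()


def unpack_request(wire: str) -> SoapRequest:
    """Parse a message produced by package_request back into a request."""
    parts = message_from_string(wire).get_payload()
    call = json.loads(parts[0].get_payload(decode=True))
    params = {
        part.get_param("name", header="content-disposition"):
            json.loads(part.get_payload(decode=True))
        for part in parts[1:]
    }
    return SoapRequest(call["object"], call["method"], params)
```

A multipart container suits this purpose because each parameter can travel in its own typed part, which is how MIME already carries mixed text, image, and audio content.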
The SOAP proxy marshals and transfers the multipart MIME-encoded SOAP request to an Application Programming Interface (API) which acts as a server-side SOAP stub for processing SOAP messages. Marshaling is the process of packaging up the data so that when it is sent from one process to another, the receiving process can decipher the data. The SOAP stub, which is running on the Web server, unpacks and parses the SOAP request, instantiates the COM Automation object, and invokes the method with the marshaled [in] parameters. The SOAP stub also returns any [out] or [in, out] parameters, and any return value, from the COM Automation object instance to the SOAP proxy, and the Automation object instance is reclaimed. Thus, SOAP is a stateless protocol, i.e., one in which an object's lifetime extends only to a single method call and the object is re-created for each call.
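The server-side behavior described above, in which an object is instantiated, invoked once, and then reclaimed, can be sketched as follows. The sketch uses an ordinary Python class in place of a COM Automation object; the registry, the FareCalculator class, and the soap_stub function are hypothetical names introduced only for illustration.

```python
class FareCalculator:
    """Hypothetical stand-in for a COM Automation object."""

    instances_created = 0  # class-level counter, used only to show lifetimes

    def __init__(self):
        FareCalculator.instances_created += 1
        self.calls = 0  # per-instance state; never survives between requests

    def quote(self, base, seats):
        self.calls += 1
        return {"total": base * seats, "calls_seen": self.calls}


# Assumed lookup table from object names to classes the stub may instantiate.
REGISTRY = {"FareCalculator": FareCalculator}


def soap_stub(object_name, method_name, in_params):
    """Handle one request: instantiate, invoke once, then drop the instance."""
    instance = REGISTRY[object_name]()        # instantiate the named object
    try:
        method = getattr(instance, method_name)
        return method(**in_params)            # invoke with the [in] parameters
    finally:
        del instance                          # lifetime ends with this call
```

Because every call receives a fresh instance, the calls_seen value in each result is always 1; any state a caller needs across calls would have to be passed in as a parameter.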