The present invention is directed to a method and system for preprocessing and packaging application files for use by a network-based streaming application service provider.
The Internet, and particularly the World Wide Web, is a rapidly growing network of interconnected computers from which users can access a wide variety of information. Initial widespread use of the Internet was limited to the delivery of static information. A newly developing area of functionality is the delivery and execution of complex software applications via the Internet. There are two basic techniques for software delivery: remote execution and local delivery, e.g., by downloading.
In a remote execution embodiment, a user accesses software which is loaded and executed on a remote server under the control of the user. One simple example is the use of Internet-accessible CGI programs which are executed by Internet servers based on data entered by a client. A more complex system is the Win-to-Net system provided by Menta Software. This system delivers client software to the user which is used to create a Microsoft Windows style application window on the client machine. The client software interacts with an application program executing on the server and displays a window which corresponds to one which would be shown if the application were installed locally. The client software is further configured to direct certain I/O operations, such as printing a file, to the client's system, to replicate the "feel" of a locally running application. Other remote-access systems, such as that provided by Citrix Systems, are accessed through a conventional Internet browser and present the user with a "remote desktop" generated by a host computer which is used to execute the software.
Because the applications are already installed on the server system, remote execution permits the user to access the programs without transferring a large amount of data. However, this type of implementation requires the supported software to be installed on the server. Thus, the server must utilize an operating system which is suitable for the hosted software. In addition, the server must support separately executing program threads for each user of the hosted software. For complex software packages, the necessary resources can be significant, limiting both the number of concurrent users of the software and the number of separate applications which can be provided.
In a local delivery embodiment, the desired application is packaged and downloaded to the user's computer. Preferably, the applications are delivered and installed as appropriate using automated processes. After installation, the application is executed. Various techniques have been employed to improve the delivery of software, particularly in the automated selection of the proper software components to install and initiation of automatic software downloads. In one technique, an application program is broken into parts at natural division points, such as individual data and library files, class definitions, etc., and each component is specially tagged by the program developer to identify the various program components, specify which components are dependent upon each other, and define the various component sets which are needed for different versions of the application.
One such tagging format is defined in the Open Software Description ("OSD") specification, jointly submitted to the World Wide Web Consortium by Marimba Incorporated and Microsoft Corporation on Aug. 13, 1999. Defined OSD information can be used by various "push" applications or other software distribution environments, such as Marimba's Castanet product, to automatically trigger downloads of software and ensure that only the needed software components are downloaded in accordance with data describing which software elements a particular version of an application depends on.
Although on-demand local delivery and execution of software using OSD/push techniques is feasible for small programs, such as simple Java applets, the download time for large applications can be prohibitively long. Thus, while suitable for software maintenance, this system is impractical for providing local application services on-demand because of the potentially long time between when the download begins and when the software begins local execution.
Recently, attempts have been made to use streaming technology to deliver software to permit an application to begin executing before it has been completely downloaded. Streaming technology was initially developed to deliver audio and video information in a manner which allowed the information to be output without waiting for the complete data file to download. For example, a full-motion video can be sent from a server to a client as a linear stream of frames instead of a complete video file. As each frame arrives at the client, it can be displayed to create a real-time full-motion video display. However, unlike the linear sequences of data presented in audio and video, the components of a software application may be executed in sequences which vary according to user input and other factors.
To address this issue, as well as other deficiencies in prior data streaming and local software delivery systems, an improved technique of delivering applications to a client for local execution has been developed. This technique is described in co-pending U.S. patent application Ser. No. 09/120,575, entitled "Streaming Modules" and filed on Jul. 22, 1998. In a particular embodiment of the "Streaming Modules" system, a computer application is divided into a set of modules, such as the various Java classes and data sets which comprise a Java applet. Once an initial module or modules are delivered to the user, the application begins to execute while additional modules are streamed in the background. The modules are streamed to the user in an order which is selected to deliver the modules before they are required by the locally executing software. The sequence of streaming can be varied in response to the manner in which the user operates the application to ensure that needed modules are delivered prior to use as often as possible.
Although an improvement over existing streaming technology, the "Streaming Modules" methodology generally operates on software-module boundaries and therefore the streaming flexibility is constrained to some extent by the structure of the application files. In addition, in one embodiment, client-side streaming functionality is added to the streamed program through the use of stub routines inserted into the program code itself. Thus, the source or object code of the program modules must be modified to prepare them for streaming.
In a newly developed application streaming methodology, described in co-pending U.S. patent application entitled "Method and System for Executing Network Streamed Applications", filed concurrently with the present application, the client system is provided with client-side streaming support software which establishes a virtual file system ("VFS") and connects it to the client's operating system such that the virtual file system appears to be a storage device. The VFS is configured as a sparsely populated file system which appears to the operating system to contain the entire set of application files but, in reality, will typically contain only portions of selected files. Client streaming functionality is provided to process streamlets or blocks of individual files and add them to the VFS as appropriate.
The present invention is directed to a method and system for preprocessing and packaging application files for use by a network-based streaming application service provider which is configured to stream the software to a client for client-side execution. The application files are divided into sets of streamlets, each of which corresponds to a data block in a particular application file at a particular offset and having a predefined length. Preferably, the data blocks are equal in size to a code page size used during file reads by an operating system expected to be present on a system executing the application. For Microsoft Windows-based systems, 4 KB code pages are used.
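The division into streamlets can be illustrated with a minimal sketch. The names `Streamlet` and `split_into_streamlets` are hypothetical and chosen only for illustration; the sketch assumes the 4 KB page size mentioned above and simply cuts a file's bytes at fixed offsets, without regard to the file's content.

```python
from typing import Iterator, NamedTuple

PAGE_SIZE = 4096  # 4 KB code page, as stated for Windows-based systems


class Streamlet(NamedTuple):
    """Hypothetical record for one block of an application file."""
    filename: str  # source application file
    offset: int    # byte offset of the block within that file
    data: bytes    # block contents (the final block may be shorter)


def split_into_streamlets(filename: str, content: bytes,
                          page_size: int = PAGE_SIZE) -> Iterator[Streamlet]:
    """Yield one streamlet per page-sized block of the file content."""
    for offset in range(0, len(content), page_size):
        yield Streamlet(filename, offset, content[offset:offset + page_size])
```

Because the split depends only on byte offsets, the same routine applies to executables, libraries, and data files alike.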
Each streamlet is then compressed and the compressed streamlets are packaged into a streamlet repository. While the original boundaries between application files are not preserved in this repository, index data is provided to permit specific streamlets to be retrieved with reference to the source filename and an offset in that file. In addition, an application file structure, which indicates how the various files associated with an application appear to a computer when the application is locally installed, is determined and this information is packaged with the repository. With reference to the application file structure, precompressed streamlets corresponding to specific code pages in a particular application file can be selectively extracted from the streamlet repository.
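One possible arrangement of such a repository is sketched below. It is an assumption-laden illustration rather than the format used by the invention: the class `StreamletRepository`, its method names, the single in-memory blob, and the choice of zlib compression are all hypothetical. The sketch shows only the essential idea that streamlets are compressed once, stored without regard to file boundaries, and located through an index keyed by filename and offset.

```python
import zlib
from typing import Dict, Tuple

PAGE_SIZE = 4096  # assumed streamlet size


class StreamletRepository:
    """Hypothetical repository: one blob of compressed blocks plus an index."""

    def __init__(self) -> None:
        self._blob = bytearray()
        # (filename, file offset) -> (position in blob, compressed length)
        self._index: Dict[Tuple[str, int], Tuple[int, int]] = {}

    def add_file(self, filename: str, content: bytes) -> None:
        """Compress each page-sized block and append it to the blob."""
        for offset in range(0, len(content), PAGE_SIZE):
            packed = zlib.compress(content[offset:offset + PAGE_SIZE])
            self._index[(filename, offset)] = (len(self._blob), len(packed))
            self._blob += packed

    def get_compressed(self, filename: str, offset: int) -> bytes:
        """Return the precompressed streamlet without decompressing it."""
        start, length = self._index[(filename, offset)]
        return bytes(self._blob[start:start + length])
```

In this sketch a server would call `get_compressed` and send the returned bytes as-is, so no compression work is performed at streaming time.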
According to a further aspect of the application preprocessing method, the application is selectively installed on a suitable test machine. Prior to installation, a "snapshot" of the environmental condition of the test machine, such as environment variable settings and the contents of system control files, is taken. This starting condition is compared with the environmental condition after the application has been installed and the environmental changes due to the application installation are determined. This information is recorded in an environmental install package. When the application is streamed to a client, the environmental install package can be used to properly configure the client system during a virtual installation of the streaming application.
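The snapshot comparison can be sketched as a dictionary difference. The function `diff_environment` is hypothetical, and the sketch assumes each snapshot has already been captured as a flat mapping of setting names to string values; how a real snapshot of system control files would be gathered is outside its scope.

```python
from typing import Dict, Optional, Tuple

Change = Tuple[Optional[str], Optional[str]]  # (value before, value after)


def diff_environment(before: Dict[str, str],
                     after: Dict[str, str]) -> Dict[str, Change]:
    """Return every setting that was added, removed, or modified.

    A value of None on the 'before' side means the setting was added by
    the installation; None on the 'after' side means it was removed.
    """
    changes: Dict[str, Change] = {}
    for key in before.keys() | after.keys():
        old, new = before.get(key), after.get(key)
        if old != new:
            changes[key] = (old, new)
    return changes
```

The resulting mapping plays the role of the environmental install package in this illustration: replaying the "after" values on a client would reproduce the changes made by the installer.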
After the application has been installed on the test machine (or another machine), the application is started and the sequence in which the various file blocks are loaded is monitored during the application startup process. Those application streamlets which are required to enable execution of the application to be initiated, and preferably those streamlets required to have the application run to a point where user interaction is required, are identified. Those identified streamlets form a startup streamlet set which represents a minimal portion of the application which should be present on the client system for the application to begin execution.
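Deriving the startup set from a monitored load sequence can be sketched as follows. The function `startup_streamlet_set` is hypothetical. The sketch assumes the monitoring step has already produced a trace of (filename, byte offset) reads that ends at the point where the application first waits for user input, and that streamlets are aligned on 4 KB boundaries.

```python
from typing import Iterable, List, Tuple

BlockId = Tuple[str, int]  # (filename, page-aligned offset)


def startup_streamlet_set(load_trace: Iterable[BlockId],
                          page_size: int = 4096) -> List[BlockId]:
    """Return the distinct streamlets touched during startup, in load order."""
    seen = set()
    ordered: List[BlockId] = []
    for filename, offset in load_trace:
        # Map the raw read offset to the start of its containing streamlet.
        block = (filename, offset - offset % page_size)
        if block not in seen:
            seen.add(block)
            ordered.append(block)
    return ordered
```

Keeping first-load order, rather than returning an unordered set, lets the same list double as a delivery order for the startup streamlets.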
According to a further aspect of the invention, in addition to packaging the application into a streamlet repository and identifying environmental install data and a startup streamlet set, the application is executed using test inputs and simulated or real user interaction, and the sequence of code and data loads generated by an operating system as it executes the application is monitored. This information is used to generate a predictive model of the order in which the application file blocks are loaded during execution. The predictive model can be used by a streaming application server to determine an optimal order in which to send streamlets to a client to minimize the likelihood that the application will require a portion of an application file before the corresponding streamlet has been sent to the client.
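The text does not specify the form of the predictive model. As one simple possibility, assumed here purely for illustration, the monitored load sequences could be reduced to first-order transition counts: for each block, how often each other block was loaded immediately afterwards. The functions `build_transition_model` and `likely_next` are hypothetical names.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, List, Sequence


def build_transition_model(
        traces: Iterable[Sequence[Hashable]]) -> Dict[Hashable, Counter]:
    """Count, for each block, which blocks were loaded directly after it."""
    model: Dict[Hashable, Counter] = {}
    for trace in traces:
        for current, following in zip(trace, trace[1:]):
            model.setdefault(current, Counter())[following] += 1
    return model


def likely_next(model: Dict[Hashable, Counter], block: Hashable,
                limit: int = 3) -> List[Hashable]:
    """Return up to `limit` blocks most often observed after `block`."""
    return [b for b, _ in model.get(block, Counter()).most_common(limit)]
```

Under this assumption, a server that has just sent a given streamlet could consult `likely_next` to choose which streamlets to send next. A production model would likely weigh longer histories and user behavior, which this sketch does not attempt.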
Through use of the new methods, preprocessed application streaming packages can be easily and quickly generated for a variety of applications. The packages can include the streamlet repository, application file structure, environmental install package, startup set definition (or the startup streamlets themselves combined in a cluster), and predictive model. The packages can then easily be provided to one or more streaming application servers.
Advantageously, the streaming application package does not require any modifications to the application code or data itself. Further, because the application files are segmented into streamlets in a manner which is independent of the actual code or data content of the files, a wide variety of application packages can be easily prepared for streaming in this manner. In addition, the application streamlet repository as well as the additional elements of the application package can be formatted and stored in any manner suitable for retrieval on the server system. Thus, the server operating system and data storage environment can be selected without regard for whether it is compatible with the application's environment and the manner in which the application is stored. Finally, precompressing each streamlet prior to delivery to the server decreases the net size of the streamlet repository and increases the overall streaming rate without increasing server load by compression on-the-fly or the time required for the server to extract the streamlets from the repository.