Running a software application on a computer usually requires executing an installation program that installs the software onto the computer's hard drive or other storage medium. Depending on the size and complexity of the software, the installation program can be quite complicated. An installation program typically manages the installation of multiple files to one or more directories within a file system of the storage medium. Often, existing configuration files are edited in order for the computer's operating system to become aware of the new software. Further, some of the edited configuration files are accessed by other applications. Such changes to a computer's environment may cause problems, such that a newly installed application may not work correctly, or possibly worse, a previously installed application may begin to malfunction. Such problems become a much larger concern when an application is installed on numerous computers across an entire company, sometimes referred to as an enterprise computing environment.
Due to such problems, the deployment and installation of software applications in an enterprise computing environment is a major challenge for the software industry. A significant percentage of all software installations fail in some manner. A software installation failure can be defined as some type of error that exists after the installation of the software. Errors can exist in the newly installed application as well as in a previously installed application. Such errors include installation time errors, run time errors, performance errors and compatibility errors. An installation time error occurs during the installation of the software itself. Installation errors may result from an incorrectly linked software component (which would have been defined by a human), from poorly written computer code that has not considered the current configuration of the client system, or from any number of other scenarios. Such an error may prevent the software application from being installed successfully. In such cases, only a portion of the required files are installed, resulting in a partial installation which is incapable of running correctly. Efforts are then required to back out the partial installation to return the computer to its previous state.
The next type of installation failure is known as a run time error. A run time error is an error that occurs during execution of the software, often while initially launching the application. One type of run time error may result in a failure to launch the software, with no warning or error messages stating the problem. As a result, nothing happens when the user attempts to execute the software. In other cases, one or more cryptic error or warning messages are displayed regarding why the application has failed to launch correctly. Other types of run time errors may occur while using the application. Under various scenarios, such as an incorrect version of some software component in the client system, the application may simply stop working during execution of one or more features within the software.
Performance errors reflect problems that allow the application to load and run successfully, but at some reduced level of performance. For example, in a typical installation of Apache Software Foundation™ Apache 5.5 Web Server (hereinafter “Apache 5.5”), the software's ability to resolve one page of Hyper Text Markup Language (hereinafter “HTML”) code and display the output on a webpage may take 5 milliseconds. In a performance hindered installation of Apache 5.5, resolving and displaying a web page may take a full second, causing a drastic reduction in Internet browsing performance.
The last type of installation error involves compatibility problems with other applications. Compatibility problems may allow the newly installed application to run properly, but one or more previously installed applications may fail to work correctly after the new installation. Such errors are often the result of a common file or group of files shared between multiple software applications. For example, the parameters in a given configuration file may be accessed by one or more applications. Such a configuration file may contain parameters required by the software. A newly installed application may alter parameters that a previously installed application expects to have remained unchanged. In another example, one or more software applications may depend upon the existence of a software service that resides on a computer. For example, many applications require TCP/IP connectivity services; TCP/IP is the standard communication protocol used by computers to communicate over the Internet. Installation of a new application may replace TCP/IP version 6.2 with 7.0. However, previously installed applications may be incompatible with TCP/IP version 7.0, causing the existing applications to experience errors.
The reasons for such software installation errors vary. Some errors are the result of the installation tools that install software onto a computer. Normally, software is delivered to users as a compact disc (“CD”) or digital versatile disc (“DVD”) or other form of removable storage media. A user would place the disc into the computer's optical drive and follow the instructions for installation. These instructions are human-defined tools that physically install the files onto a storage medium. The tools are prone to errors during installation for a variety of reasons. Installation errors may also result from the way software applications are constructed and packaged, rather than from the installation tools that apply the software onto a computer system. Installation tools are human created, which allows for the possibility of human-generated errors. The packaging and construction of software are also defined by humans. As a result, the packaging of software may be prone to installation errors as well.
Software is normally constructed of multiple packages. Each package usually has one or more pieces of functionality within the entire software application. Each piece of functionality will further contain numerous individual files containing software code. An individual software file comes in the form of differing types of functionality. For example, a software file could be a shared library file, configuration file, executable file, etc. A shared library is a human understandable listing of variables, functions or procedures that define how certain functions of a software application work. It would also be accessible by one or more other files, hence the reason it is called a “shared” library. A configuration file may also be in human understandable language. Its function is to define variables that are used throughout the software application. For example, one entry in a configuration file might specify that the default installation path for the software is /bin/usr/apps. This variable could be changed by editing the file at any time. An executable file differs in that it is not readily understandable by humans. The executable file is a compilation of one or more files, containing software code, that have been compiled to create a binary file understandable to a computer directly.
In an example of the delineation of functionality between software packages, an accounting application may contain a package that controls accounts receivable. Another package may control the functionality for accounts payable. Such package-based presentation of a software application is the result of the way software applications are written. Software packages are usually written by numerous software programmers. In order to manage the efforts of each programmer, their tasks are divided into small pieces of functionality where each functional piece can communicate with the others. The division of such functional pieces often results in packages. For example, a software application may comprise 57 packages, with each package comprising hundreds of individual files. One group of software programmers might be tasked with writing the accounts receivable portion and its associated files, with another group responsible for the accounts payable portion and its associated files. Knowing how to divide the functionality between each software package is as much an art as it is computer science.
The division of functionality between packages is the result of compromises. On one side, the more packages that an application comprises, the greater the ability to divide functionality between each package, resulting in a more compact and compartmentalized design. For example, if a software application contains 20 packages, the amount of functionality required in each package is far more than if the same application had 200 packages. On the other side of the compromise, the smaller the number of packages, the easier it is for a system administrator to grasp the division of functionality. Typically, a system administrator is the person or persons within an enterprise who are responsible for installing and maintaining software applications in the enterprise environment. When installing a software application comprised of individual packages, the administrator executes an initial installation script that begins the installation process. Depending on the specific software application and its complexity, an installation script may pose one or more questions to the administrator. Such questions might involve where to physically install the software within the computer's file system, what optional features or services are desired, or the privilege level for installing the software. Conventionally, the software installation process is script driven. Installation scripts set forth the above types of questions and record the answers for later use during the installation. For example, if a script asks where to install an application, the provided answer would then be used during installation to install the application in the desired location in the file system.
One way that an administrator is able to reduce the amount of interaction required during an installation is to modify the installation scripts to remove the questions and enter the answers directly into the script. Hence, when the installation script is executed, no questions are asked, as the answers are already provided.
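The unattended approach described above can be sketched as follows. This is a minimal Python illustration, not any particular vendor's installer: the question keys, the `collect_answers` function and its `preset` and `ask` parameters are all hypothetical names introduced here.

```python
from typing import Callable, Dict, Optional

# Hypothetical questions an installation script might pose to the administrator.
QUESTIONS = {
    "install_path": "Where should the software be installed?",
    "optional_features": "Which optional features are desired?",
}


def collect_answers(
    preset: Optional[Dict[str, str]] = None,
    ask: Callable[[str], str] = input,
) -> Dict[str, str]:
    """Return an answer for every question, prompting only when no preset exists."""
    preset = preset or {}
    answers = {}
    for key, prompt in QUESTIONS.items():
        if key in preset:
            # Answer was written into the script ahead of time: no interaction.
            answers[key] = preset[key]
        else:
            # Fall back to asking the administrator interactively.
            answers[key] = ask(prompt + " ")
    return answers
```

When every key appears in `preset`, the function never calls `ask`, which corresponds to a script whose questions have been removed and whose answers have been entered directly.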
Along with the compromises mentioned above, there are additional problems which continue to escalate over the life cycle of a typical application. These problems are mostly centered on a concept sometimes referred to as “software drift.” Once a software application and its division of functionality between packages is defined, it becomes familiar to the system administrators who install and maintain the application. If the division of functionality between packages changes in the future (i.e., it “drifts”), whether from the fixing of software bugs, functionality improvements or additions, etc., this may cause difficulty for the system administrators who were already familiar with the previous delineation of packages. Hence, software drift can create a growing conflict between the needs of the administrators and the preferences of the software developers as versions of a software application incrementally change. For example, when a software application is originally created, the original definition of the individual packages within the application likely involved a compromise between the functional interaction between the individual files that make up the package and something comprehensible to system administrators. However, as software versions increase, it is likely that the delineation between the packages will change, which in turn increases the complexity of the installation as well as the potential for various installation errors.
To address these problems, packaging formats for software are continually evolving. However, each change tends to represent minor or incremental improvements over the prior approach that only address the results of the inherent problems rather than the inherent problems of software packaging. Much of the hesitation to change how software is packaged is due to the unwillingness of software vendors to change the way software development projects are designed. A software application is a self-contained entity that can be delivered on a CD/DVD-ROM. Rarely would this application have any relationships to any other software application. This is one of the major problems with the current method for software packaging. All of the decisions and software dependencies are made at the time of the software creation. Hence, the developers are aware of the various computing system configurations and generally attempt to account for them, but they know little about the uniqueness of the particular computing systems the software is installed on.
There are a number of software packaging formats in use today, many of which date back to the 1980s when the current problems of software packaging originated. FIG. 1 is a block diagram illustrating the general components in a conventional computer software package. There are five major components to a basic software application 100. The core software inventory 110 is the main component that contains the actual files of the software application 100. These files are organized into packages. The core software inventory 110 is the eventual compilation of bits to be installed onto a computing system. One or more of these files are often stored in a compressed format.
Functional relationships with other packages 120 are the second major component of a basic software application 100. A functional relationship is a requirement, imposed by the software to be installed, that some other condition be met in order for the software application to install and run properly. For example, a functional relationship may require that an additional software application or service be installed before the new software application can be installed. In order to install Apache 5.5, for example, TCP/IP services should be installed on the system. In other examples, a functional relationship may require that certain services be installed concurrently with the software to be installed, or that certain software or services not be present on the computing system due to incompatibilities between certain software applications and services.
Finally, in yet another example, a functional relationship may require that one or more software applications or services be de-installed before installation of the new software because the new software may replace one or more packages.
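The kinds of functional relationships described above can be sketched as a simple check against the set of items already installed. This is a hedged Python illustration only; the `Relationships` class, its field names and `check_relationships` are hypothetical and are not taken from any packaging format. The "installed concurrently" case is omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Relationships:
    """Hypothetical declaration of a package's functional relationships."""
    requires: Set[str] = field(default_factory=set)   # must already be installed
    conflicts: Set[str] = field(default_factory=set)  # must not be present
    replaces: Set[str] = field(default_factory=set)   # must be de-installed first


def check_relationships(rel: Relationships, installed: Set[str]) -> List[str]:
    """Return a list of problems that would block installation (empty if none)."""
    problems = []
    for name in sorted(rel.requires - installed):
        problems.append(f"missing prerequisite: {name}")
    for name in sorted(rel.conflicts & installed):
        problems.append(f"incompatible item present: {name}")
    for name in sorted(rel.replaces & installed):
        problems.append(f"must be de-installed first: {name}")
    return problems
```

For instance, a package declaring `requires={"tcpip"}` reports no problems on a system whose installed set contains `"tcpip"`, and reports a missing prerequisite otherwise.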
The package manifest 130 is the third component in the basic software package 100. The package manifest 130 is a list of all of the files within the packages that make up the basic software application 100. Thus, the manifest lists all of the files in the core software inventory 110. The manifest is often used for validation purposes in order to confirm that each and every file required for installation is accounted for within the core software inventory 110.
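The validation role of the manifest can be illustrated with a short Python sketch. It assumes the manifest and the inventory are each available as a plain collection of file names; the function name `validate_manifest` and the example file names are hypothetical.

```python
from typing import Iterable, Set, Tuple


def validate_manifest(
    manifest: Iterable[str], inventory: Iterable[str]
) -> Tuple[Set[str], Set[str]]:
    """Compare the manifest against the files actually present in the inventory.

    Returns (missing, unlisted):
      missing  - files the manifest lists but the inventory does not contain
      unlisted - files in the inventory that the manifest does not mention
    """
    listed = set(manifest)
    present = set(inventory)
    return listed - present, present - listed
```

An installation would typically proceed only when `missing` is empty, for example when `validate_manifest(["bin/app", "etc/app.conf"], ["bin/app", "etc/app.conf"])` returns two empty sets.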
A pre-installation script 140 is the next component in the basic software package 100. This script describes what needs to be validated prior to the installation of a software application. Generally speaking, a script is a software file that sequentially lists steps that are to be executed. For example, a script may list steps for creating a new directory, moving files into it from another location, validating the size of the files as being within a threshold range and sending an email if the files are outside the threshold range. There are numerous scripting languages that exist for writing scripts, such as Perl, Python and Tcl. As mentioned above, there can often be numerous dependencies that exist between the software to be installed and other software or services that may be needed. Other validation requirements may be included in a pre-installation script 140 aside from dependencies. For example, the pre-installation script may determine whether there is enough disk space to install the software application. Another example is whether there is enough memory available to run the application effectively. Further, the pre-installation scripts may also serve the purpose of asking a system administrator questions regarding the installation. Examples of such questions were discussed above.
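A disk space check of the kind mentioned above might look like the following Python sketch. It uses the standard library call `shutil.disk_usage`; the function names and the failure message format are hypothetical, and the memory check is left out because the standard library offers no portable way to query available memory.

```python
import shutil
from typing import List


def has_enough_disk_space(target_dir: str, required_bytes: int) -> bool:
    """Return True if the file system holding target_dir has enough free space."""
    return shutil.disk_usage(target_dir).free >= required_bytes


def run_pre_install_checks(target_dir: str, required_bytes: int) -> List[str]:
    """Run the pre-installation validations and return a list of failures."""
    failures = []
    if not has_enough_disk_space(target_dir, required_bytes):
        failures.append(f"not enough disk space in {target_dir}")
    # Further validations (dependencies, memory, privileges) would be added here.
    return failures
```

A pre-installation script built this way would abort the installation whenever the returned list is non-empty.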
A post-installation script 150 is the final component in the basic software package 100. Similar to the pre-installation script mentioned above, the post-installation script 150 describes what needs to be performed after installation of the software application 100 has been completed. An example of such a script entry may be that the computing system needs to be rebooted in order for new startup processes to be loaded or old ones to be deleted. In another example, the post-installation script 150 may require de-fragmentation of the hard drive, depending on the nature of the installation and where the files are stored on the hard drive.
FIG. 2 is a block diagram that illustrates the functional relationships between the packages that comprise Software Application A. Software application A (200) comprises packages 1-5 (210-250). Each package encapsulates a group of one or more functions required to install the application. Coming out of each package are a number of straight lines connected to other packages. These lines 205 illustrate the functional relationships that exist between packages. For example, package 1 (210) has an interrelationship with packages 2 (220), 3 (230) and 5 (250). Hence, it is not possible to install package 1 (210) without the inclusion of packages 2, 3 and 5 as each of these packages interrelate to one another. For example, package 1 may provide the function of accounts receivable within an accounting software application. Since accounts payable (e.g., package 2 (220)) is an essential part of the software application, it would not be possible to install package 1 without also installing package 2. Further, package 2 (220) also has a functional relationship 205 to other packages. Packages 3, 4 and 5 (230, 240 and 250) also have functional relationships 205 to other packages. In this example, there are only five packages, which are quite manageable for a system administrator. However, if those five packages are extracted down to the granular level (not shown for simplicity), there may be thousands of files with thousands of functional relationships between the files. A typical system administrator would be greatly challenged to comprehend the hierarchy and functional relationships of so many files.
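The consequence of such relationships, namely that installing one package pulls in every package reachable from it, can be sketched in Python as a graph traversal. Only the relationships of package 1 (to packages 2, 3 and 5) come from the description above; the remaining edges in `RELATIONSHIPS` are assumptions made purely for illustration, as is the function name `install_set`.

```python
from collections import deque
from typing import Dict, Set

# Package number -> packages it has a functional relationship with.
RELATIONSHIPS: Dict[int, Set[int]] = {
    1: {2, 3, 5},  # stated in the description of FIG. 2
    2: {4},        # assumed for illustration
    3: {4},        # assumed for illustration
    4: {5},        # assumed for illustration
    5: {3},        # assumed for illustration
}


def install_set(start: int, relationships: Dict[int, Set[int]]) -> Set[int]:
    """Return every package that must accompany `start`, including `start` itself."""
    seen = {start}
    queue = deque([start])
    while queue:
        pkg = queue.popleft()
        for related in relationships.get(pkg, ()):
            if related not in seen:
                seen.add(related)
                queue.append(related)
    return seen
```

With five packages the result is easy to verify by hand; at the granularity of thousands of files, the same traversal is what an administrator would otherwise have to carry out mentally.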
Much of the recent development of software packaging has focused on improvements in three core components of the basic software package, namely functional relationships with external applications, pre-installation scripts and post-installation scripts. Software vendors are putting forth much effort on making improvements to the pre-installation and post-installation scripts and their descriptions. One of the original challenges to software vendors was that these scripts were not well validated and could not adjust to specific installation needs. One attempt at addressing this challenge is by writing scripts with meta languages, such as XML. This may allow for a more syntactical runtime verification of these scripts. An example of such XML-based install scripts is the Debian packaging format used by many recent versions of the Linux™ operating system.
Management and validation of functional relationships is the other core component where much effort is being placed on improvements to the basic software application. The generation of functional relationships is human defined. This means that software developers have to determine which functional relationships are required before installation of a software application. As such, the creation of functional relationships within software is prone to human errors since they are artificially created during development and do not necessarily correspond to the unique functional relationships that may occur during installation. In other words, conventional functional relationship creation occurs at a point in time before the installation of a software application. Hence, these functional relationships are generic in that they exist for all computing system configurations without any ability to change depending on the uniqueness of each computing system environment. Another problem that can arise from the human declaration of functional relationships is circular relationships between individual software files that cannot be resolved because the relationships are created based on artificial constraints. A circular relationship occurs when two software files or functional blocks of software code are both declared to relate to each other. As a solution, developers are creating automated validators that help define functional relationships. Such validators can then be used to validate the functional relationships in a software package.
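One check that such an automated validator might perform is detecting circular relationships. The following Python sketch uses a standard depth-first search over the declared relationships; it is an illustration under assumed data structures, not the method of any specific validator, and the name `find_cycle` is hypothetical.

```python
from typing import Dict, List, Optional, Set

UNVISITED, IN_PROGRESS, DONE = 0, 1, 2


def find_cycle(graph: Dict[str, Set[str]]) -> Optional[List[str]]:
    """Return one circular chain of relationships, or None if there is none.

    `graph` maps each file or package name to the names it is declared to relate to.
    """
    state = {node: UNVISITED for node in graph}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        state[node] = IN_PROGRESS
        path.append(node)
        for nxt in sorted(graph.get(node, ())):
            if state.get(nxt, UNVISITED) == IN_PROGRESS:
                # nxt is already on the current path: the relationships loop back.
                return path[path.index(nxt):] + [nxt]
            if state.get(nxt, UNVISITED) == UNVISITED:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        state[node] = DONE
        return None

    for node in sorted(graph):
        if state[node] == UNVISITED:
            found = visit(node)
            if found:
                return found
    return None
```

Two files declared to relate to each other, as in `{"a": {"b"}, "b": {"a"}}`, yield the chain `["a", "b", "a"]`, while an acyclic declaration yields `None`.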
As described above, there are inherent problems with the way that software is conventionally packaged and installed on computing systems today.