With the wide adoption of cloud computing and ongoing server consolidation, there is strong demand to migrate an existing software stack (i.e., a set of software subsystems or components that cooperate to provide a solution, e.g., an operating system, middleware, a database and an application) from a source environment to a target environment (e.g., a cloud or consolidated servers). However, migrating an existing software stack to a cloud or to consolidated servers is an extremely complicated activity.
One of the key challenges is understanding the source environment, that is, discovering the software stack configurations and the components depended upon, given that such software has typically been running for a long time and has accumulated many configurations that are not well documented.
Software, especially distributed enterprise software, is highly diverse. This diversity appears in the following respects:
1. Software configuration descriptions comprise: standard configurations, which use standard-compliant (e.g., JEE, OSGi) deployment descriptors or annotations to specify configurations; ad-hoc configurations, which use non-standard metadata to specify configurations in files such as *.xml, *.properties, etc.; and hard-wired configurations, in which some configurations are hard-coded in binary files.
2. Software running environment configurations differ: different products have their own configuration formats, and different versions of the same product may also have different configurations; e.g., the configuration file formats of the various JEE application servers differ from one another.
3. The resources that software depends on comprise: common third-party frameworks (e.g., Spring, Hibernate, Axis, etc.); specific third-party components (e.g., cplex.jar, cplex.dll, cbc.dll, etc.); and native components (e.g., *.jar, *.dll, *.war, *.ear, etc.).
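By way of illustration only, the configuration categories listed above can be sketched as a simple file classifier. This is a minimal sketch under stated assumptions, not a method described in this document: the function name `classify`, the category labels, and the choice of descriptor file names (web.xml, application.xml and ejb-jar.xml as JEE deployment descriptors, MANIFEST.MF for OSGi bundles) are assumptions introduced here. Matching on file names and extensions alone is naive and would miss, for example, hard-wired configurations inside binaries.

```python
from pathlib import PurePosixPath

# Assumed examples of standard-compliant descriptors (JEE and OSGi).
STANDARD_DESCRIPTORS = {"web.xml", "application.xml", "ejb-jar.xml", "MANIFEST.MF"}
# Extensions the text associates with ad-hoc configuration metadata.
AD_HOC_SUFFIXES = {".xml", ".properties"}
# Extensions the text associates with binary components and resources.
BINARY_SUFFIXES = {".jar", ".dll", ".war", ".ear"}


def classify(path: str) -> str:
    """Return a coarse, hypothetical category label for one file path."""
    p = PurePosixPath(path)
    suffix = p.suffix.lower()
    # Check well-known descriptor names first, since they also end in .xml.
    if p.name in STANDARD_DESCRIPTORS:
        return "standard"
    if suffix in AD_HOC_SUFFIXES:
        return "ad-hoc"
    if suffix in BINARY_SUFFIXES:
        return "binary"
    return "other"


if __name__ == "__main__":
    for sample in ["app/WEB-INF/web.xml", "conf/db.properties", "lib/cplex.jar", "README"]:
        print(sample, classify(sample))
```

A classifier of this kind only labels files by name; it does not establish which of them the application actually needs at run time.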
Given the above diversity of software configurations and the huge number of files (tens of thousands or even more) in a source environment, it remains an open problem how to identify a reasonably sized set of files that includes all the software configurations and depended-upon resources necessary for running the migration target application.
Currently, the following solutions to this problem exist:
The first is the questionnaire approach, e.g., using spreadsheets or Word documents, or tools such as Rational Focal Point. However, this method relies heavily on human knowledge of the source environment and software, and is therefore error-prone.
The second is an automation approach for known products (e.g., WebSphere Application Server (WAS)). Examples include RAF (Rational Automation Framework), which can extract the configurations of specific products (e.g., WAS); UCM (Unix Configuration Migration Tool), which is a product-specific plug-in for extracting product-specific configurations; and TADDM (Tivoli Application Dependency Discovery Manager), which extracts software dependencies using product-dependent agents. However, this automation approach relies on knowledge of specific products and cannot process unknown products, native components or ad-hoc configurations.
It can be seen that there is a need in the art for an improved method for determining the range of files to be migrated for the migration target application.