A cloud computing environment provides computation, software, data access, and storage services. Cloud computing describes a new supplement, consumption, and delivery model for Information Technology (IT) services based on Internet protocols, and it typically involves provisioning of dynamically scalable and often virtualized resources.
Cloud computing providers deliver applications via the Internet; these applications are accessed through web browsers or through desktop and mobile applications. The software, e.g., business software, applications and data, is stored on servers at a remote location (e.g., a computing data center).
As is known, "virtual" and "cloud computing" concepts include the utilization of a set of shared computing resources (e.g., servers), which are typically consolidated in one or more data center locations. For example, cloud computing systems may be implemented as a web service that enables a user to launch and manage computing resources (e.g., virtual server instances) in third-party data centers.
Different computing resources may be created within a cloud computing infrastructure or data center. For example, a resource may include all the components necessary to run application software, and may include, e.g., UNIX, Linux, or Windows operating systems (O/S), middleware, and specific application software or data, as desired by a user. The information for configuring the resource to be created is referred to as an image. After an image has been instantiated, the resulting resource is referred to as an instance (e.g., a server instance).
Migrating software stacks from a "physical" to a "virtual" environment provides an opportunity to standardize the software components that are used. As data centers may host hundreds or thousands of hosts, e.g., servers (HTTP/Web servers, database servers, developer servers, etc.) containing OS, middleware and other software components, taking an inventory and understanding how all these software components are used is a challenge. For example, in large data centers there may be multiple servers having images that include different versions and/or customizations of the same software application/package.
It becomes a further challenge to partition hosts (or their virtual images) into groups with similar sets of software, such that each of those groups can then be transformed into a more standardized set of software.
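The partitioning described above may be sketched, for purposes of illustration only, as grouping hosts by the overlap of their installed software sets. The following hypothetical example (host names, package names, the `jaccard` and `group_hosts` helpers, and the similarity threshold are all illustrative assumptions, not part of any described system) uses Jaccard similarity with a greedy assignment:

```python
# Hypothetical sketch: group hosts whose software inventories are similar,
# so each group could later be standardized onto one software set.
# All host and package names below are illustrative.

def jaccard(a, b):
    """Jaccard similarity between two sets of software components."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def group_hosts(inventory, threshold=0.6):
    """Greedily assign each host to the first group whose representative
    software set is at least `threshold` similar to the host's own set."""
    groups = []  # list of (representative software set, [host names])
    for host, packages in inventory.items():
        for rep_set, members in groups:
            if jaccard(rep_set, packages) >= threshold:
                members.append(host)
                break
        else:
            groups.append((set(packages), [host]))
    return [members for _, members in groups]

inventory = {
    "web01": {"linux", "httpd-2.2", "php-5.3"},
    "web02": {"linux", "httpd-2.2", "php-5.3", "memcached"},
    "db01":  {"linux", "db2-9.7"},
}
print(group_hosts(inventory))  # → [['web01', 'web02'], ['db01']]
```

In this sketch the two web servers fall into one group, which could then be migrated onto a single standardized web-server image, while the database server forms its own group.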
Performing this kind of analysis for a data center or existing cloud computing infrastructure is very labor intensive and error prone, especially since the number of hosts (or virtual "images") and the number of software components they contain can be large.
Further, analysis problems may occur after migration, during normal operation or "steady state". In steady state, operators need to find groups of images that share a vulnerability to a virus, share some bug, etc., so that all of them can be upgraded with a new feature, or can be aggregated or otherwise simplified.
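The steady-state query described above, finding every image that contains a given vulnerable component, may be sketched as a simple filter over image inventories. The example below is purely illustrative (the image names, the `openssl` versions, and the `images_with_package` helper are assumptions introduced here, not part of any described system):

```python
# Hypothetical sketch: find all images containing a package at a version
# known to be vulnerable, so they can be patched or consolidated together.
# Image names and version strings are illustrative only.

def images_with_package(images, package, vulnerable_versions):
    """Return the names of images whose inventory lists `package`
    at any of the given vulnerable versions."""
    return sorted(
        name for name, pkgs in images.items()
        if pkgs.get(package) in vulnerable_versions
    )

images = {
    "img-a": {"openssl": "1.0.1f", "httpd": "2.4.7"},
    "img-b": {"openssl": "1.0.1g"},
    "img-c": {"openssl": "1.0.1f"},
}
print(images_with_package(images, "openssl", {"1.0.1e", "1.0.1f"}))
# → ['img-a', 'img-c']
```

A tool supporting steady-state operations would apply such queries across the whole inventory, rather than requiring operators to inspect images one at a time.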
While tools exist that make an inventory of the software in an environment, e.g., Tivoli Application Dependency Discovery Manager (TADDM) or Mirage™ (both systems available from the current Assignee, International Business Machines Corporation), these inventories and the user interfaces to them focus on individual machines or images. It is not easy for users to find similarities between environments.
In the area of data mining in particular, matrix visualizations and clustering do not provide operations specific to image transformations.