Distributed computing systems, such as cloud computing environments, typically consist of a multitude of resources, many of which operate in relative isolation. In addition, many of these resources are themselves composed of multiple sub-components. For instance, a given cloud computing environment often hosts multiple applications and services, each of which often utilizes multiple virtual assets, such as server instances and data storage instances, as well as non-virtual resources, such as third-party access systems and various “bare metal” resources.
A long-standing problem associated with the diversity of components and elements operating in a given cloud computing environment has been the inability to accurately and efficiently identify and correlate related events taking place in different portions of the cloud. One reason correlating events in the cloud has proven so difficult is that many of the resources hosted in a given cloud computing environment maintain their own log data in relative isolation. Consequently, while it might be easily understood why a given asset in a cloud computing environment experienced a given event if it were known that a related trigger event had taken place in another portion of the cloud, without this correlation of log data entries from two different log data sources, there is often no explanation for the occurrence of an event, or the behavior of a given asset, in the cloud computing environment.
As an illustrative example, assume a service provided through a cloud computing environment employs multiple virtual machine instances and the virtual machine instances are accessed via the Internet using variable sets of IP addresses assigned to the service by a cloud computing environment provider hosting the service. In this case, each of the virtual machine instances would typically maintain its own internal log data recording various log entries related to the events associated with that virtual machine instance, i.e., each virtual machine instance would be a source of log data associated with that virtual machine instance. In addition, in this specific illustrative example, the cloud computing environment provider would also maintain log data recording events associated with the cloud infrastructure, i.e., the cloud computing environment provider would be a source of log data entries associated with the cloud infrastructure. Finally, the service would typically maintain its own log data, often consisting of the collection of log data from each of the associated virtual machine instances.
For the purposes of illustration, assume one or more of the IP addresses assigned to the service by the cloud computing environment provider are cancelled/destroyed by the cloud computing environment provider. In this case, log entry data associated with the cloud infrastructure would indicate the event of the one or more IP addresses being cancelled. In addition, the log data for each of the virtual machine instances using the cancelled IP addresses would undoubtedly also include log entry data indicating the events of these resources dropping offline.
Using current systems, the log data for each of the virtual machine instances would not be correlated with the log data associated with the cloud infrastructure. Consequently, considerable time and energy could be expended to “discover” that the IP addresses associated with the virtual machine instances were destroyed at the infrastructure level and that this event was the cause of these virtual machine instances dropping offline. However, if the log entry data associated with the cloud infrastructure indicating the destruction of the IP addresses were correlated with the log entry data from the virtual machine instances indicating the instances dropped offline, it would be immediately apparent why the virtual machine instances dropped offline.
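The correlation described in this illustrative example can be sketched in code. The following is a minimal sketch only, not a definitive implementation: the `LogEntry` record, the event names `ip_destroyed` and `instance_offline`, the five-minute matching window, the source labels, and the sample IP addresses are all hypothetical assumptions introduced for illustration. It pairs each infrastructure-level IP destruction entry with any virtual machine instance entry that reports dropping offline on the same IP address shortly afterward.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class LogEntry:
    """Hypothetical log record; real log formats will differ."""
    source: str          # which log data source produced the entry
    timestamp: datetime  # when the event was recorded
    event: str           # assumed event name, e.g. "ip_destroyed"
    ip_address: str      # IP address the event refers to


def correlate(infra_entries, instance_entries, window=timedelta(minutes=5)):
    """Pair each infrastructure trigger entry with instance entries that
    share its IP address and occur within `window` after the trigger."""
    pairs = []
    for trigger in infra_entries:
        if trigger.event != "ip_destroyed":
            continue  # only IP destruction is treated as a trigger here
        for entry in instance_entries:
            delay = entry.timestamp - trigger.timestamp
            if (
                entry.event == "instance_offline"
                and entry.ip_address == trigger.ip_address
                and timedelta(0) <= delay <= window
            ):
                pairs.append((trigger, entry))
    return pairs


# Hypothetical sample data for the two log data sources.
infra_log = [
    LogEntry("cloud-infrastructure", datetime(2024, 1, 1, 12, 0, 0),
             "ip_destroyed", "203.0.113.10"),
]
vm_log = [
    # Same IP, shortly after the trigger: correlated.
    LogEntry("vm-1", datetime(2024, 1, 1, 12, 0, 30),
             "instance_offline", "203.0.113.10"),
    # Different IP: not correlated.
    LogEntry("vm-2", datetime(2024, 1, 1, 12, 0, 45),
             "instance_offline", "203.0.113.20"),
    # Same IP but before the trigger: not correlated.
    LogEntry("vm-3", datetime(2024, 1, 1, 11, 0, 0),
             "instance_offline", "203.0.113.10"),
]

matches = correlate(infra_log, vm_log)
for trigger, entry in matches:
    print(f"{entry.source} went offline after {trigger.ip_address} was destroyed")
```

In this sketch, matching on a shared IP address within a bounded time window stands in for the link between a trigger event and its effect. Other linking keys, such as an instance identifier, could be substituted under different assumptions about what the log data contains.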
What is needed is a method and system for correlating, and/or supplementing, log entry data from two different log data sources in a cloud environment when one or more trigger events connecting the log entry data from the two different log data sources are detected.