Real-time Log Analysis (“RTLA”) may allow an organization to monitor the service and error logs of a number of host computers and devices in near-real time in order to spot trends in service performance or customer demand, as well as to troubleshoot potential problems. An RTLA system may collect log data from the host computers and devices, process and collate the collected data, and analyze the collated data to generate service metrics. These metrics may then be published to host management systems, alarming and alerting services, reporting and graphing services, and support services. The generated metrics may include fatal error counts/rates, page views, service availability, host access rates, hardware performance measures and the like. Management and support personnel may utilize the published metrics and the processed and collated log data to receive alerts of potential problems or failures, troubleshoot host or service problems, determine additional resources that need to be made available to meet growing demand, spot trends in service or product demand and the like.
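By way of illustration only, the process, collate, and analyze stages described above might be sketched as follows. The log line format, the sample entries, the one-minute bucketing, and the names `parse_line`, `collate`, and `fatal_metrics` are all assumptions made for this example and are not drawn from any particular RTLA system; collection from hosts and publishing of the metrics are omitted.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical log format (an assumption, not taken from the disclosure):
#   "<ISO-8601 timestamp> <host> <LEVEL> <message>"
SAMPLE_LOGS = [
    "2024-01-01T12:00:05 host-a INFO page_view /home",
    "2024-01-01T12:00:17 host-a FATAL db connection lost",
    "2024-01-01T12:00:42 host-b INFO page_view /cart",
    "2024-01-01T12:01:03 host-b FATAL out of memory",
    "2024-01-01T12:01:30 host-a INFO page_view /home",
    "not a valid line",
]


def parse_line(line):
    """Process step: turn one raw line into (minute, host, level), or None."""
    parts = line.split(maxsplit=3)
    if len(parts) < 3:
        return None
    try:
        timestamp = datetime.fromisoformat(parts[0])
    except ValueError:
        return None  # skip lines whose first field is not a timestamp
    # Truncate to the minute so entries can be grouped into one-minute buckets.
    return timestamp.replace(second=0, microsecond=0), parts[1], parts[2]


def collate(lines):
    """Collate step: count entries by level within each (minute, host) bucket."""
    buckets = defaultdict(Counter)
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue
        minute, host, level = record
        buckets[(minute, host)][level] += 1
    return buckets


def fatal_metrics(buckets):
    """Analyze step: derive a fatal error count and rate for each bucket."""
    metrics = {}
    for key, counts in buckets.items():
        total = sum(counts.values())
        fatal = counts["FATAL"]
        metrics[key] = {"fatal_count": fatal, "fatal_rate": fatal / total}
    return metrics


METRICS = fatal_metrics(collate(SAMPLE_LOGS))
```

In this sketch, `METRICS` maps each (minute, host) pair to its fatal error count and rate, which is the kind of per-host metric that might then be handed to an alerting or graphing service.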
In an RTLA system that monitors a large number of services and/or hosts, the high volume of log data collected, processed and analyzed may result in an unacceptable latency between the logging of events/errors and the publishing of the related metrics. For example, in a system comprising tens of thousands of host computers, the RTLA system may collect and process multiple terabytes of log data daily, and may incur a latency between the logging of events/errors and the generation and publishing of the related metrics on the order of several minutes, such as 8 to 10 minutes. In addition, a sudden increase in log volume due to external events, such as a denial-of-service (“DoS”) attack or the deployment of defective code, may further increase the latency in the RTLA system, delaying investigation and analysis of potential problems. Such a delay in the investigation and resolution of problems may result in prolonged service unavailability, leading to significant loss of revenue, violation of service level agreements and the like.
It is with respect to these and other considerations that the disclosure made herein is presented.