Large-scale World Wide Web (“Web”) services commonly utilize many thousands of server computers and/or virtual machine instances (which may be referred to herein as “service hosts”) to service client requests. In such a large-scale service, it is common for the service hosts to generate log files (“logs”) that include data describing various aspects of their operation. For example, service hosts might create service logs containing data describing aspects of the processing of client requests, performance logs containing data describing one or more performance characteristics of the service hosts, and error logs containing data describing errors generated by the service hosts. The service hosts might also generate other types of logs containing other types of information.
The volume of log files generated can be enormous when, as described above, many thousands of service hosts are utilized to implement a service. For example, if a large-scale service is implemented using several thousand service hosts, it would not be unusual for the service hosts to generate several hundred gigabytes (“GB”) of log files per hour. It can be extremely time-consuming to locate data of interest in such a large set of data. This can be particularly frustrating for an administrator of such a large-scale service when quick access to data in the log files is needed to assist with addressing a problem condition.
It is with respect to these and other considerations that the disclosure made herein is presented.