Logs are used to understand what went wrong with a computing system when errors or other fault conditions occur. Typical logging behavior includes writing a log message to a local text file immediately after an event occurs in a component (e.g., an application). Logs for components in an application cluster are likewise used to understand a failure in the application cluster. In general, a logging appliance gathers logs from many logging handlers distributed throughout the application cluster to simplify and formalize the organization of all logs. Whether logs are gathered by a logging appliance or manually, support personnel often face a lack of detail about what failed components were doing when a problem first occurred. This is because the gathered logs generally provide only coarse information.
Consider that logging handlers, which collect log messages locally from components, may collect log messages at different levels of detail. For example, fine-grained logging is resource-expensive because its volume and rate are dramatically higher than those of coarse-grained logging, in which logging generally occurs for fewer events, such as warnings or severe errors. Fine-grained logging can exceed coarse-grained logging by many orders of magnitude in resource consumption.
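The difference in volume between levels of detail can be illustrated with a minimal sketch using Python's standard logging module. The component behavior, the message mix, and the names CountingHandler, simulate_component, and compare_volumes are hypothetical and chosen only for illustration; they are not taken from any particular logging appliance.

```python
import logging


class CountingHandler(logging.Handler):
    """Hypothetical handler that counts records instead of writing them."""

    def __init__(self, level):
        super().__init__(level)
        self.count = 0

    def emit(self, record):
        # Only records at or above this handler's level reach emit().
        self.count += 1


def simulate_component(logger, requests):
    """Assumed workload: three routine messages per request, rare warnings."""
    for i in range(requests):
        logger.debug("request %d: parsed headers", i)
        logger.debug("request %d: cache lookup", i)
        logger.info("request %d: served", i)
        if i % 100 == 0:
            logger.warning("request %d: slow response", i)


def compare_volumes(requests=1000):
    """Return (fine_count, coarse_count) for the same simulated workload."""
    logger = logging.getLogger("component_sketch")
    logger.setLevel(logging.DEBUG)   # let every record through to the handlers
    logger.propagate = False         # keep the sketch isolated from the root logger
    fine = CountingHandler(logging.DEBUG)      # fine-grained: everything
    coarse = CountingHandler(logging.WARNING)  # coarse-grained: warnings and above
    logger.addHandler(fine)
    logger.addHandler(coarse)
    try:
        simulate_component(logger, requests)
    finally:
        logger.removeHandler(fine)
        logger.removeHandler(coarse)
    return fine.count, coarse.count


if __name__ == "__main__":
    fine_count, coarse_count = compare_volumes(1000)
    print(f"fine-grained records: {fine_count}, coarse-grained records: {coarse_count}")
```

Under these assumptions, the fine-grained handler records every message the component emits, while the coarse-grained handler records only the occasional warning. The actual ratio in a real application cluster depends on the workload and on how the components are instrumented.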
Thus, administrators of an application cluster must decide what level of logging is acceptable. A tradeoff must be made between coarse logging, which offers better performance, and fine logging, which offers better manageability, and manageability typically loses out. Accordingly, when an application cluster does fail, the lack of detail from coarse logging often makes understanding the failure difficult.