When developing software, programmers usually add log statements to the source code, which later provide information about the execution of the program. This information is recorded in one or more log files during execution and can be viewed and/or analyzed using a number of tools. Log files are valuable data sources for debugging, and the time needed to troubleshoot production failures is reduced when they are available. In cases where it is difficult to reproduce a production failure, logs are sometimes the only resource available to aid problem diagnosis. This is especially true for critical bugs, which require fast resolution and for which the time spent reproducing the failure should be minimized.
Currently, there is no accepted industry-wide standard for logging, and it is often done in an arbitrary manner. A number of libraries exist to aid logging, but such libraries offer little help in standardizing logging practice. For example, software projects built using Java as the main programming language often use log4j as the logging library.
Logging libraries frequently associate a verbosity level with each log printing statement. For example, log4j has built-in verbosity levels such as TRACE, DEBUG, INFO, WARN, ERROR and FATAL, with TRACE being assigned the lowest rank and FATAL the highest. The advantage of verbosity levels is that the library can be configured to emit only a subset of log statements without recompiling the code or modifying the log printing statements. For example, if the root logger verbosity level in log4j is set to INFO, only the log statements with a verbosity level of INFO or above (i.e., INFO, WARN, ERROR and FATAL) are logged.
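The threshold rule described above can be sketched in a few lines of Java. This is a minimal illustration and does not use the log4j API: the `Level` enum, `isEmitted` and `emittedLevels` are hypothetical names introduced here, and the only assumption carried over from the text is the rank order from TRACE (lowest) to FATAL (highest).

```java
import java.util.ArrayList;
import java.util.List;

public class VerbosityFilterSketch {
    // Hypothetical enum, declared from lowest to highest rank as listed in the text.
    enum Level { TRACE, DEBUG, INFO, WARN, ERROR, FATAL }

    // A statement is emitted when its level ranks at or above the configured level.
    // Enum.compareTo compares declaration order, which here encodes the rank.
    static boolean isEmitted(Level statementLevel, Level configuredLevel) {
        return statementLevel.compareTo(configuredLevel) >= 0;
    }

    // Lists every level that would pass the filter for a given configuration.
    static List<Level> emittedLevels(Level configuredLevel) {
        List<Level> result = new ArrayList<>();
        for (Level level : Level.values()) {
            if (isEmitted(level, configuredLevel)) {
                result.add(level);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // With the configured level set to INFO, TRACE and DEBUG are suppressed.
        System.out.println(emittedLevels(Level.INFO)); // prints [INFO, WARN, ERROR, FATAL]
    }
}
```

Changing the argument passed to `emittedLevels` changes which statements pass the filter while the statements themselves stay untouched, which is the property the paragraph attributes to verbosity levels.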
Due to the arbitrary nature of logging, programmers who are new to a project often find it difficult to understand the ideal logging behavior to be followed for that project. The logging behavior in this context refers to various aspects of logging, including the density of log statements, verbosity level assignment, diagnostic context in the log messages, and the like. If the expected behavior is not properly understood, the result may be missing log statements, insufficient contextual data, unnecessary log statements and improper verbosity levels. Missing log statements and insufficient contextual data can make the diagnosis of production failures difficult.
Developers are often confused about what to log. The decision may be simple in the case of error conditions or exceptions, but for other parts of the code, even guidelines offer little help. This results in missing or unnecessary log statements. When in doubt, developers tend to add more log printing statements than necessary, since doing so is considered safer than missing vital data. But unnecessary log statements result in fast roll-over of log files, and valuable diagnostic data is lost as a consequence. They also create visual clutter and can confuse the developer during debugging.
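The effect of roll-over on diagnostic data can be illustrated with a small Java sketch. It is only a model under stated assumptions: a fixed-capacity queue stands in for a size-capped log file, the class and method names are hypothetical, the log lines are made up for illustration, and `capacity` is assumed to be at least 1.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class RollOverSketch {
    // Hypothetical stand-in for a size-capped log: keeps only the newest `capacity` lines.
    // Assumes capacity >= 1.
    static Deque<String> writeAll(String[] lines, int capacity) {
        Deque<String> log = new ArrayDeque<>();
        for (String line : lines) {
            if (log.size() == capacity) {
                log.removeFirst(); // the oldest line is discarded, as happens on roll-over
            }
            log.addLast(line);
        }
        return log;
    }

    public static void main(String[] args) {
        // Illustrative lines only: one useful error followed by low-value statements.
        String[] lines = {
            "ERROR payment failed for order 17",
            "DEBUG entering loop",
            "DEBUG loop step",
            "DEBUG leaving loop"
        };
        String vital = "ERROR payment failed for order 17";
        System.out.println(writeAll(lines, 3).contains(vital)); // prints false
        System.out.println(writeAll(lines, 4).contains(vital)); // prints true
    }
}
```

With room for only three lines, the three low-value statements push the error out of the retained log. With room for four, or with one fewer unnecessary statement, the error survives.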
Improper verbosity level assignment causes problems in both directions. If the rank of a log statement's verbosity level is higher than it should be, the statement creates noise in the logs and contributes to fast roll-over. If the rank is lower than it should be, the statement will not be logged whenever the configured verbosity level has a higher rank. Given all of these difficulties in understanding the logging behavior, it would be beneficial to provide software developers with guidance about the ideal logging behavior that should be followed for each project.