A log file, or simply a log, is a file that records events that occur while a computer system is operating or while a program is executing. The purpose of a log file is to provide data that can be used to understand the activity that occurred during that execution and to diagnose problems with applications or with the operating system running on the computer system. Logs may comprise machine-generated data, such as web logs, network events, call data records, and RFID information, produced by Internet Protocol (“IP”) enabled endpoints or devices.
Most log files comprise raw unstructured data, that is, information that does not have a predefined data model and has not been analyzed. Raw unstructured data, such as application logs and web logs, is typically text-heavy but may also contain dates, numbers, facts, and master data. Because this data lacks a predefined structure, it is challenging to obtain the useful information embedded in large log files.
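To make the difficulty concrete, the following minimal Python sketch shows one way dates and numbers might be pulled out of free-text log lines. The sample lines, the assumed "date time LEVEL message" layout, and the names `SAMPLE_LINES`, `LINE_PATTERN`, and `parse_line` are all hypothetical and chosen only for illustration; real logs vary widely in layout.

```python
import re
from datetime import datetime

# Hypothetical log lines, invented for illustration only.
SAMPLE_LINES = [
    "2023-04-01 12:00:03 ERROR payment failed for order 1182 after 3 retries",
    "2023-04-01 12:00:07 INFO user alice logged in after 2 attempts",
    "service restarted unexpectedly",
]

# Assumed layout: "<date> <time> <LEVEL> <free-text message>".
LINE_PATTERN = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Return a dict of extracted fields, or None if the line does not match."""
    match = LINE_PATTERN.match(line)
    if match is None:
        return None
    message = match.group("message")
    return {
        "timestamp": datetime.strptime(
            match.group("timestamp"), "%Y-%m-%d %H:%M:%S"
        ),
        "level": match.group("level"),
        "message": message,
        # Standalone integers embedded in the free text.
        "numbers": [int(n) for n in re.findall(r"\b\d+\b", message)],
    }


if __name__ == "__main__":
    for line in SAMPLE_LINES:
        print(parse_line(line))
```

The third sample line does not follow the assumed layout, so `parse_line` returns `None` for it. A hand-written pattern only recovers information from lines that happen to match it, which is why extraction from large, heterogeneous log files is hard.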