The subject matter discussed in the background section should not be assumed to be prior art merely as a result of its mention in the background section. Similarly, a problem mentioned in the background section or associated with the subject matter of the background section should not be assumed to have been previously recognized in the prior art. The subject matter in the background section merely represents different approaches, which in and of themselves may also be inventions.
A computing system can serve multiple users by executing software applications. An event can occur when a system user uses an application. Examples of events include logins to and logouts from an application, page requests, loads, and views, record accesses, file and report downloads and exports, clicks on an application's user interface buttons, and uses of corresponding application programming interfaces (APIs). The system can respond to such events by generating data that is saved in log files. For example, if a user's client device downloads a file, a system logger stores a corresponding log entry in a log file. The log entry can include data such as a user identifier, a download event type, a timestamp when the download occurred, the name of the downloaded file, and internal system information, such as the bandwidth used by the system to provide the download. If another user's client logs into the application, the logger can store a new log entry in the same log file or in another log file. The new log entry can include data such as the user identifier, a login event type, a geographic location from which the client logged into the application, a timestamp when the login occurred, and internal system information, such as a server load associated with the login.
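The logging behavior described above can be illustrated with a minimal sketch. The function name write_log_entry, the field order, and all sample values are assumptions made for illustration only; the disclosure does not specify a log entry layout at this point, and an in-memory buffer stands in for a real log file.

```python
import csv
import io
from datetime import datetime, timezone


def write_log_entry(log_file, event_type, user_id, timestamp, *details):
    """Append one delimited log entry.

    Assumed layout (hypothetical): event type, user identifier, timestamp,
    followed by event-specific fields and internal system information.
    """
    csv.writer(log_file).writerow(
        [event_type, user_id, timestamp.isoformat(), *details]
    )


log_file = io.StringIO()  # stands in for a real log file
when = datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc)

# A download event: file name plus internal bandwidth information.
write_log_entry(log_file, "DOWNLOAD", "user-001", when,
                "report.pdf", "bandwidth_kbps=512")

# A login event: geographic location plus internal server load information.
write_log_entry(log_file, "LOGIN", "user-002", when,
                "San Francisco, CA", "server_load=0.42")

entries = log_file.getvalue().splitlines()
for entry in entries:
    print(entry)
```

In this sketch the two event types share one log file but carry different fields after the timestamp, which is why a reader of the file needs to know which layout applies to which entry.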
A system administrator can use log files to understand the activities of the system and to debug problems. Some end users may want to view some of the data in the log files to perform their own data analysis. For example, a company's director is interested in the log data related to logins, downloads, and user interface clicks, in order to analyze how and where the company's employees are accessing their applications, what the employees are downloading, and what features the employees are using. Therefore, the system parses a log file generated for an application, identifies log data for the system administrator, and identifies end user-facing log data for some of the users. For example, the director maps the geographical locations from which the employees' clients are downloading files, and identifies a problem by determining that a confidential file was downloaded to an unsecured location.
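The separation of end user-facing log data from internal log data can be sketched as follows. The set USER_FACING_FIELDS, the function split_entry, and the sample entry are hypothetical; the disclosure does not enumerate which fields are exposed to end users.

```python
# Hypothetical classification of fields; not taken from the disclosure.
USER_FACING_FIELDS = {"event_type", "user_id", "timestamp", "file_name", "location"}


def split_entry(entry):
    """Split one parsed log entry into user-facing and internal parts."""
    user_facing = {k: v for k, v in entry.items() if k in USER_FACING_FIELDS}
    internal = {k: v for k, v in entry.items() if k not in USER_FACING_FIELDS}
    return user_facing, internal


# A parsed download entry with made-up values.
entry = {
    "event_type": "DOWNLOAD",
    "user_id": "user-001",
    "timestamp": "2024-01-15T09:30:00+00:00",
    "file_name": "confidential.pdf",
    "location": "Cafe Wi-Fi, Lisbon",
    "bandwidth_kbps": "512",
}

user_facing, internal = split_entry(entry)
print(user_facing)  # data a director could analyze
print(internal)     # data kept for the system administrator
```

Under these assumptions, the director would receive the location and file name needed to spot a confidential download, while the bandwidth figure stays with the system administrator.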
Because one or more loggers use different schemas to generate different log entries for different applications, the schemas may be analyzed to derive the schema for each log entry type. In this disclosure, the terms schema and metadata refer to the same concept. For example, a system administrator analyzes the system's schemas to derive a customer relationship management (CRM) schema, which is identified by a CRM log code in a log entry, and which a CRM logger uses to generate CRM log entries for a CRM application. When a system user requests specific CRM data, a log processor identifies log entries that include the CRM log code, uses the derived CRM schema to determine that the requested CRM data is located at the fifth location of a delimited text data format (such as comma-separated values) in each identified CRM log entry, extracts the requested CRM data from the fifth location of the identified CRM log entries, and outputs the requested CRM data to the system user.
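The schema-driven extraction described above can be sketched as follows. The SCHEMAS table, the field names, the function extract_field, the placement of the log code in the first location, and the sample log lines are all assumptions for illustration; the disclosure states only that a log code identifies the schema and that the requested data sits at the fifth location of each CRM log entry.

```python
import csv

# Hypothetical derived schemas, keyed by log code. Each list gives the field
# stored at each location of a comma-separated log entry.
SCHEMAS = {
    "CRM": ["log_code", "timestamp", "user_id", "org_id", "account_name"],
}


def extract_field(log_lines, log_code, field_name):
    """Return the requested field from every entry carrying the given log code."""
    # Look up the field's location in the derived schema (0-based index).
    position = SCHEMAS[log_code].index(field_name)
    values = []
    for row in csv.reader(log_lines):
        # Assumes the log code is the first field of each entry.
        if row and row[0] == log_code:
            values.append(row[position])
    return values


log_lines = [
    "CRM,2024-01-15T09:30:00Z,user-001,org-9,Acme Corp",
    "AUTH,2024-01-15T09:31:00Z,user-002,login",
    "CRM,2024-01-15T09:32:00Z,user-003,org-9,Globex",
]

# "account_name" is the fifth field of the assumed CRM schema (index 4).
account_names = extract_field(log_lines, "CRM", "account_name")
print(account_names)
```

The sketch also shows why the schema must be derived first: without it, the log processor has no way to know that the requested data is in the fifth location, or that entries with other log codes should be skipped.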