The amount of data generated by the various machines (e.g., appliances, servers, and software tools) connected in an organization is enormous. The machine-generated data may be in a structured textual format, an unstructured textual format, or a combination thereof. Examples of such machine-generated textual data include logs, metrics, configuration files, messages, spreadsheets, events, alerts, sensory signals, audit records, database tables, and so on.
A typical IT process proceeds as follows: data gathering begins when an end user contacts an IT professional to report an issue. The reported issue is described in the form of symptoms as observed from the user's perspective such as, for example, an identifier of an error message, inability to access a service, inability to start a system or program, and the like. For example, the user may call or email an IT professional, or submit a report via a service portal, to say that logging in to the workstation takes a very long time. A ticket is created in an IT service management (ITSM) system for the issue. Tickets in the ITSM system are then investigated and resolved.
Existing solutions, which include human operators manually applying playbooks based on user-provided information for ITSM systems, have various disadvantages. First, an issue is often reported as an open-ended problem that may be the result of a long, branching series of events. For example, a user report indicating that the user interface of an application is unusually slow may have multiple branching potential causes. Second, problems are often indicated by metrics (e.g., CPU, RAM, disk, and network usage) that provide poor information regarding the problem itself. Third, the point at which a human operator may start investigating an issue is often so far removed from the root cause that determining the root cause is practically impossible. Due to these and other challenges, playbooks for solving incidents usually focus on gathering data and improving the starting point of the investigation, rather than on solving the problem that caused the issue.
Various disadvantages of the existing solutions are caused at least in part by reliance on user-provided information for addressing issues. Specifically, the user-provided information may include inaccurate descriptions of issues, particularly when the symptoms of those issues are only loosely related to the underlying root causes. Further, different users may provide varying descriptions of the same symptom, which may result in symptoms that are essentially the same being addressed differently. Thus, any relationships between issues and descriptions thereof ultimately rely on manual inputs and are therefore often inaccurate. This leads to misclassification of issues and, consequently, incorrect recommendations.
Another issue with existing solutions for addressing IT problems is that the urgency of the same issue may be determined differently at different times of day because different numbers of users report the issue. However, these differences in the number of reporting users may be caused by, for example, variations in the total number of users accessing services at different times of day rather than by any difference in the issue itself.
Additionally, existing solutions often prove inconvenient for end users since they require users to submit information for tickets either personally or via a subordinate. Thus, a user typically needs to call, email, or visit a web portal in order to report an issue. Accordingly, issues often remain unaddressed until users begin reporting them. If the user is not available to report the problem, the reporting may be delayed, which in turn delays correction of the problem since the IT department is not aware of the issue.
Further, the reporting options offered by a particular IT department may not suit some users. For example, some IT departments may only provide the option of reporting issues using text submitted through a web portal, whereas a user may prefer to speak to a live representative. Moreover, the reporting itself may prove inconvenient, particularly when a user calls to report an issue and must wait due to a high number of other users reporting simultaneously.
It would therefore be advantageous to provide a solution that would overcome the deficiencies of the prior art.