Identifying recorded interactions, e.g., in a call center (where groups of service representatives interact with users or customers, for example by telephone but also by other communication methods), is a challenge faced by the industry. For example, identifying recorded interactions related to a specific product, service, or issue can help a contact center better serve clients. Various systems and methods for identifying and/or categorizing recorded interactions are known, e.g., categorizing recorded interactions based on phrases that appear in the interactions. An interaction may be, for example, a conversational exchange between one or more people, e.g., a verbal conversation, a conversation via e-mail, or a conversation via text message. Such interactions may be recorded, e.g., by audio recording, recordings of texts, etc.
Generally, the process of building phrase-based categories (that may be used to identify interactions), as done by known systems and methods, is a long, tedious, and mostly manual effort that requires substantial knowledge and training. For example, in order to identify or categorize interactions, users (e.g., experts or other employees in a contact center) must run many queries and listen to calls to identify particular phrases that best represent the topic of the category that is being constructed. For each phrase identified, a user must determine how well the phrase contributes to the category by once again listening to many calls. After a category is constructed, the user must use sampling to estimate and optimize the accuracy of the category, lowering the number of false positives while increasing the number of interactions it can identify.
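By way of illustration only, the trade-off between false positives and the number of interactions identified can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not a description of any known system: the names Interaction, matches_category and evaluate_category, the simple substring matching, and the four sample transcripts are all hypothetical, and the on_topic labels stand in for the judgments a user would form by listening to sampled calls.

```python
from dataclasses import dataclass


@dataclass
class Interaction:
    text: str       # transcript of the recorded interaction (hypothetical data)
    on_topic: bool  # label a reviewer assigned after listening to the call


def matches_category(text, phrases):
    """Return True if any category phrase appears in the transcript."""
    lowered = text.lower()
    return any(phrase.lower() in lowered for phrase in phrases)


def evaluate_category(sample, phrases):
    """Estimate precision and recall of a phrase list on a labeled sample."""
    true_pos = false_pos = false_neg = 0
    for item in sample:
        hit = matches_category(item.text, phrases)
        if hit and item.on_topic:
            true_pos += 1
        elif hit:
            false_pos += 1   # flagged, but the reviewer judged it off topic
        elif item.on_topic:
            false_neg += 1   # on topic, but no phrase matched
    flagged = true_pos + false_pos
    relevant = true_pos + false_neg
    precision = true_pos / flagged if flagged else 0.0
    recall = true_pos / relevant if relevant else 0.0
    return precision, recall


# Illustrative sample for a hypothetical "account cancellation" category.
sample = [
    Interaction("I want to cancel my subscription today", True),
    Interaction("Please cancel the order, then resend it", False),
    Interaction("How do I close my account for good", True),
    Interaction("What are your opening hours", False),
]

narrow = ["cancel my subscription"]
broad = ["cancel", "close my account"]

print(evaluate_category(sample, narrow))
print(evaluate_category(sample, broad))
```

On this sample, the narrow phrase list produces no false positives but misses one of the two on-topic interactions, while the broad list identifies both on-topic interactions at the cost of one false positive. In the manual process described above, each such estimate requires a user to listen to and label the sampled calls.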
Consequently, resulting categories produced by known systems and methods are less than ideal, e.g., due to human error and to the lack of categorization expertise of the employee who handles the categorization. Other issues that further aggravate the problem may include imperfect phonetic or transcription engines (the output of which is used by employees when categorizing interactions), human inability to identify all relevant phrases, and/or inability to correctly optimize the trade-off between detection and accuracy. Moreover, the time and money spent in the process may be substantial. Accordingly, while categorizing interactions is highly desirable in the industry, doing so efficiently remains a challenge.