Research and development focused on customer connections to and from the call center has been around for some time, and supervisors and managers periodically review call length and call frequency data to understand performance. Thanks to speech-to-text technologies, they may also search recordings of conversations for keywords. However, the art has so far not found effective ways to support active feedback, supervision, and review of conversations based on behavior, especially in real time and across distributed teams. In this context, the term "behavior" refers to how people speak, and specifically to the tonal, pacing, mirroring, and turn-taking measurements that describe how people come across to each other independently of the words they use.
Providing an excellent customer experience in call centers has increasingly become strategic for enterprises, and there is a growing understanding that how an agent comes across can affect conversation outcomes. The status quo in call center agent feedback and supervision is for a supervisor to randomly select an agent-member conversation, listen to it, and provide commentary. Agents have few tools that readily support self-study and self-improvement. Supervisors have no way to track multiple agents at once, and few methods for intelligently selecting which conversation to listen in on. Many supervisors review only a single conversation a month for each agent on their team, and thus have little information about how their agents sound while speaking with customers. The situation is made more challenging because call center teams are increasingly dispersed, with many agents now working from home, which reduces a supervisor's ability to listen for tone and behavior by walking around an office space. At the same time, call center teams can suffer from significant turnover, resulting in a high need for training.
Disclosed is a system for extracting and visualizing behavioral insight from speech interactions in real time. The system provides a scalable solution for the real-time capture, transformation, and visualization of the non-verbal components of speech and voice interactions. Vocal signals such as speaking rate, conversational flow, dynamism, and vocal effort, and events such as laughter and audible breathing, are extracted from audio streams such as phone calls, transformed into quantitative values that change over time, and visualized in a dashboard display. The system also visualizes multiple live conversations simultaneously, allowing multiple individuals to be observed and monitored at once. For example, call center supervisors can monitor the conversations of many agents in real time, and use the visualized information to drive their training, monitoring, and feedback processes. Agents can review their prior conversations and evaluate both their own signal data and that of anyone they were speaking with, allowing for longitudinal analysis of customer reactions, interest, and engagement. Furthermore, the system provides a gamification element by visualizing, in real time, individual and team progress against benchmarks, prior performance, team averages, or other company-dictated milestones. This drives agent and supervisor performance and engagement with their existing workflow, helping a company achieve its stated milestones. Overall, the system is designed to present vocal signal information visually to agents and supervisors, allowing them to use these changing metrics to inform their own decision-making process. In this sense, the system provides situational awareness for speech interactions.
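As a purely illustrative sketch of the "transformed into quantitative values that change over time" step, the following Python fragment reduces a mono audio signal to one record per time window. It is not the disclosed implementation: the function and field names (windowed_metrics, WindowMetrics, mean_energy, active_ratio), the 20 ms frame and 1 s window sizes, and the energy threshold are all assumptions chosen for the example. Mean frame energy stands in as a crude proxy for vocal effort, and the fraction of frames above the threshold as a crude proxy for speech activity.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class WindowMetrics:
    """One time-stamped record of the kind a dashboard could plot."""
    start_s: float       # window start time, in seconds
    mean_energy: float   # mean per-frame RMS (hypothetical vocal-effort proxy)
    active_ratio: float  # share of frames at or above the energy threshold


def frame_rms(frame: Sequence[float]) -> float:
    """Root-mean-square amplitude of one frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame)) if frame else 0.0


def windowed_metrics(
    samples: Sequence[float],
    sample_rate: int,
    frame_ms: int = 20,       # assumed frame size
    window_ms: int = 1000,    # assumed reporting interval
    threshold: float = 0.05,  # assumed activity threshold (samples in [-1, 1])
) -> List[WindowMetrics]:
    """Split the signal into frames, then summarize the frames per window."""
    frame_len = max(1, sample_rate * frame_ms // 1000)
    frames_per_window = max(1, window_ms // frame_ms)

    # Per-frame energy over non-overlapping, complete frames only.
    energies = [
        frame_rms(samples[i:i + frame_len])
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

    metrics = []
    for w in range(0, len(energies), frames_per_window):
        chunk = energies[w:w + frames_per_window]
        metrics.append(
            WindowMetrics(
                start_s=w * frame_ms / 1000,
                mean_energy=sum(chunk) / len(chunk),
                active_ratio=sum(e >= threshold for e in chunk) / len(chunk),
            )
        )
    return metrics
```

Calling windowed_metrics on one second of silence followed by one second of a steady tone yields two records, the first with zero activity and the second fully active. A production system would process the stream incrementally and use far richer features; this sketch only shows the shape of the time-varying values that feed a visualization.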