1. Field of the Invention
The present invention relates in general to the field of computers and similar technologies, and in particular to software utilized in this field. Still more particularly, it relates to a method, system and computer-usable medium for identifying question-answer pair associations within a dialog.
2. Description of the Related Art
Dialogs between two or more individuals are a part of everyday life, and it is common to transcribe certain of these conversations into textual form for various reasons. For example, a transcription of a meeting may contain questions asked by various participants and responses provided by others during the course of one or more dialogs. Such responses may contain answers to a given question, but they may also contain verbiage that conveys no relevant information. Furthermore, a participant's response may itself be a question. As another example, call center transcriptions may capture a dialog between a customer and an agent. These dialogs may be related to a particular product or service and will typically contain a variety of questions and answers. As yet another example, an earnings transcription is typically generated when a CEO or CFO announces their company's earnings. These transcriptions contain not only dialogs between the company's representatives and people on the call or at the event, but also various kinds of questions and answers.
In the preceding examples, questions and their corresponding answers typically occur in an unstructured form within a dialog, which can present challenges in identifying associated question-answer pairs. For example, certain questions may not contain any context, or the questions and answers may be very short, as they represent real-life, informal, conversational exchanges. Furthermore, the questions and their corresponding answers may not be proximate to each other within the dialog. Moreover, the dialog may contain several different questions that are answered, but whose answers are not particularly important or do not contain significant information.
Known approaches to these challenges include processing textual content to generate fact-based questions and answers, which are then combined to provide associated question-answer pairs. In certain of these approaches, the questions need not be represented in the text, as they are generated from facts that are present within the text as knowledge. Accordingly, the resulting question-answer pairs provide a way of representing a knowledge domain in the form of questions and their corresponding answers. Other known approaches are implemented for a given domain. In these approaches, textual content pertaining to a particular domain is processed to generate questions which, if answered, provide a summary of the textual content. For example, one such approach takes an insurance claim as input and generates questions one should ask about the claim in order to verify whether the claim is true. However, none of these approaches provides a way to identify related question-answer pairs that are embedded within a dialog.