Spoken dialog systems (SDSs) consist of multiple subsystems, such as automatic speech recognizers (ASRs), spoken language understanding (SLU) modules, dialog managers (DMs), and spoken language generators, among others, interacting synergistically and often in real time. Each of these subsystems is complex and brings with it design challenges and open research questions in its own right. Rapidly bootstrapping a complete, working dialog system from scratch is therefore a challenge of considerable magnitude. Apart from the issues involved in training reasonably accurate ASR and SLU models that work well in the domain of operation in real time, one must also ensure that the individual subsystems work well in sequence, so that overall SDS performance does not suffer and the system provides an effective interaction with interlocutors who call into it.
The ability to rapidly prototype and develop such SDSs is important for applications in the educational domain. For example, in automated conversational assessment, test developers might design several conversational items, each in a slightly different domain or subject area. In such situations, one must be able to rapidly develop models and capabilities to ensure that the SDS can handle each of these diverse conversational applications gracefully. This is also true in the case of learning applications and so-called formative assessments: one should be able to quickly and accurately bootstrap SDSs that can respond to a wide variety of learner inputs across domains and contexts. Language learning and assessment add yet another complication in that systems need to deal gracefully with non-native speech. Despite these challenges, the increasing demand for non-native conversational learning and assessment applications makes this avenue of research an important one to pursue; however, this requires us to find a way to rapidly obtain data for model building and refinement in an iterative cycle.