XML, the Extensible Markup Language, is a general-purpose specification for creating custom markup languages. XML is said to be extensible in that it has no predetermined format, but rather is a meta-language for describing other languages. Thus, XML enables users to design customized markup languages for various types of documents and to define user-specified elements. XML is used to encode documents and to serialize data. One of the benefits provided by XML is that it facilitates the sharing of structured data across different information systems, especially via the Internet.
XML encompasses XML-based speech synthesis applications, such as the Speech Synthesis Markup Language (SSML), which is based on the recommendation of the World Wide Web Consortium (W3C) Voice Browser Working Group. XML-based speech synthesis applications use the XML format to annotate text input to a speech synthesizer. Using defined elements to annotate the text input, an application developer can specify the structure of a document comprising the text input, provide pronunciations of words and phrases, indicate phrasing, emphasis, pitch, and speaking rate, and control other critical speech characteristics. In the absence of annotation using SSML or another XML-based speech synthesis application, the speech synthesizer alone must be relied on to interpret the text input when generating a synthesized speech output.
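By way of illustration only, the kind of annotation described above can be sketched in a few lines of Python using the standard xml.etree.ElementTree module. The function name build_ssml and the sample sentences are hypothetical and chosen for this example; the element and attribute names (speak, p, s, emphasis, prosody, sub) are taken from the W3C SSML recommendation. The sketch only produces the marked-up text and assumes that a separate speech synthesizer would consume it.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C SSML recommendation.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml() -> str:
    """Return a small SSML document as a string (hypothetical sample text)."""
    # Root element: declares the SSML version, namespace, and language.
    speak = ET.Element(
        "speak",
        {"version": "1.0", "xmlns": SSML_NS, "xml:lang": "en-US"},
    )

    # Document structure: one paragraph containing two sentences.
    paragraph = ET.SubElement(speak, "p")

    # First sentence: emphasis on a single word.
    first = ET.SubElement(paragraph, "s")
    first.text = "Your order will arrive "
    emphasis = ET.SubElement(first, "emphasis", {"level": "strong"})
    emphasis.text = "tomorrow"
    emphasis.tail = "."

    # Second sentence: slower speaking rate and lower pitch, plus a
    # spoken substitution for an abbreviation.
    second = ET.SubElement(paragraph, "s")
    prosody = ET.SubElement(second, "prosody", {"rate": "slow", "pitch": "low"})
    prosody.text = "This notice follows the "
    sub = ET.SubElement(prosody, "sub", {"alias": "World Wide Web Consortium"})
    sub.text = "W3C"
    sub.tail = " recommendation."

    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml())
```

Even for two short sentences, each annotated word or phrase requires its own element, attributes, and surrounding text to be assembled by hand.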
Notwithstanding the distinct advantages provided by an XML-based speech synthesis application, using such an application can be tedious. Oftentimes, a number of repetitive tasks are involved in creating or editing marked-up text for input to a text-to-speech system using the tags and rules specified by SSML or another XML-based speech synthesis application.