Text-to-text applications include machine translation, automated summarization, question answering, and similar applications in which a machine interprets some kind of input information and generates text from it. The input information is often “text”, but can be any kind of information that the machine can receive and understand.
Conventional text-to-text applications use heterogeneous methods for implementing the generation phase. Machine translation often produces sentences using application-specific decoders that are based on earlier work in speech recognition. Automated summarization produces abstracts using task-specific strategies.
Text-to-text applications have struggled to use generic natural language generation (NLG) systems, because they typically do not have access to the kind of information required by the formalisms of those systems. For example, natural language generation may require formalisms such as semantic representations, syntactic relations, and lexical dependencies. The formalisms also require information that is obtained from deep semantic relations, as well as shallow semantic relations and lexical dependency relations. Machine systems typically do not have access to deep subject-verb or verb-object relations.
A number of natural language generation systems are known, including FUF, Nitrogen, HALogen, and Fergus.
The formal language of IDL (Interleave; Disjunction; Lock) was proposed by Mark-Jan Nederhof and Giorgio Satta in their 2004 paper: IDL-expressions: a formalism for representing and parsing finite languages in natural language processing. Journal of Artificial Intelligence Research, 21: 287-317. Using IDL expressions, one can compactly represent word- and phrase-based encoded meanings. Nederhof and Satta also present algorithms for intersecting IDL expressions with non-probabilistic context-free grammars.
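As an illustration of how the three IDL operators, together with ordinary concatenation, describe a finite set of word sequences, the following Python sketch enumerates the strings denoted by a small expression. It is a minimal sketch under stated assumptions, not the representation or the parsing algorithms of Nederhof and Satta: it expands the language by brute force, whereas the point of the formalism is to keep the representation compact. All function names (atom, concat, disjunction, lock, interleave, strings) are hypothetical and chosen only for this example.

```python
# Illustrative only: brute-force enumeration of the finite language denoted
# by an IDL-style expression. A language is a set of strings; each string is
# a tuple of "units"; each unit is a tuple of words that interleaving may
# not split. All names here are hypothetical.

def atom(word):
    """Language containing the single one-word string."""
    return {((word,),)}

def concat(*langs):
    """Concatenation: every string of each language, in the given order."""
    result = {()}
    for lang in langs:
        result = {left + right for left in result for right in lang}
    return result

def disjunction(*langs):
    """Disjunction: a string from any one of the argument languages."""
    return set().union(*langs)

def lock(lang):
    """Lock: fuse each string into one unit so nothing can be inserted into it."""
    out = set()
    for s in lang:
        flat = tuple(w for unit in s for w in unit)
        out.add((flat,) if flat else ())
    return out

def _shuffle(a, b):
    """All order-preserving merges of two unit sequences."""
    if not a:
        return {b}
    if not b:
        return {a}
    return ({(a[0],) + rest for rest in _shuffle(a[1:], b)} |
            {(b[0],) + rest for rest in _shuffle(a, b[1:])})

def interleave(*langs):
    """Interleave: merge strings freely, keeping each string's own unit order."""
    result = {()}
    for lang in langs:
        result = {m for left in result for right in lang
                  for m in _shuffle(left, right)}
    return result

def strings(lang):
    """Render a language as a sorted list of space-separated word strings."""
    return sorted(" ".join(w for unit in s for w in unit) for s in lang)

# "finally" may appear before or after the locked noun phrase, never inside it.
example = interleave(
    lock(concat(atom("the"), disjunction(atom("prisoners"), atom("captives")))),
    atom("finally"),
)

for s in strings(example):
    print(s)
```

Without the lock, the same interleave would also admit "the finally prisoners". The lock is what lets an expression keep a phrase contiguous while leaving its position relative to other material free.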