Computer-aided information retrieval from historical manuscripts or other document types, as well as from sequences of spoken text, is still very difficult and limited.
A direct search based on sample sequences is very slow and does not generalize to other writing styles or to other accents in speech. A search over pre-transcribed machine-readable text (e.g. ASCII) is fast, but it requires a manual transcription process that is expensive in time and human resources and is prone to error.
The document A. Graves et al., “A Novel Connectionist System for Unconstrained Handwriting Recognition”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 31, no. 5, May 2009, discloses a method for recognizing unconstrained handwritten text. The approach is based on a recurrent neural network designed for sequence labeling tasks in which the data is hard to segment and contains long-range bidirectional interdependencies.
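To illustrate what "bidirectional interdependencies" means for sequence labeling, the following minimal sketch runs a plain one-unit tanh recurrence over a sequence once from left to right and once from right to left, and pairs the two states at each position. It is a simplification under stated assumptions, not the architecture of the cited paper: the function names `rnn_pass` and `bidirectional_states` and the fixed scalar weights `w_in` and `w_rec` are hypothetical, and no training or output layer is shown.

```python
import math


def rnn_pass(xs, w_in, w_rec):
    """Run a one-unit tanh recurrence over xs; return the state after each step."""
    h, states = 0.0, []
    for x in xs:
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    return states


def bidirectional_states(xs, w_in=1.0, w_rec=0.5):
    """Pair a left-to-right state with a right-to-left state for every position.

    The weights are hypothetical fixed scalars chosen only for illustration.
    """
    forward = rnn_pass(xs, w_in, w_rec)
    # Process the reversed sequence, then flip the states back into input order.
    backward = rnn_pass(xs[::-1], w_in, w_rec)[::-1]
    return list(zip(forward, backward))


# Two inputs that differ only in their last element.
a = bidirectional_states([1.0, 0.0, 0.0])
b = bidirectional_states([1.0, 0.0, 1.0])

# At position 0 the forward state is identical, but the backward state differs,
# so a label assigned at position 0 can depend on input that arrives later.
print(a[0], b[0])
```

In this sketch the label at any position could be computed from both components of the pair, which is how context from both directions becomes available without segmenting the input first.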
Document US 2009/0077053 A1 discloses a method for searching a term in a set of ink data. The method includes an operation for converting ink data into intermediate data in an intermediate format in the form of at least one segmentation graph. Each node of the graph includes at least one ink segment associated with at least one assumption of correspondence with a recognition unit. The method further includes an operation for searching for the term carried out on the intermediate data.
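The kind of intermediate data described above can be pictured with a small sketch. It assumes, purely for illustration, that each node groups one or more ink segments, carries scored candidate recognition units, and points to the nodes that may follow it. The names `Node` and `search_term`, the scores, and the example graph are hypothetical and are not taken from US 2009/0077053 A1.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    segments: tuple      # ids of the ink segments grouped in this node
    hypotheses: dict     # candidate recognition unit -> score in [0, 1]
    successors: list = field(default_factory=list)  # indices of following nodes


def search_term(nodes, term, start_nodes):
    """Return (score, path) pairs for every path whose hypotheses spell `term`.

    Assumes the graph is acyclic; the score of a path is the product of the
    scores of the hypotheses chosen along it.
    """
    results = []

    def walk(idx, pos, score, path):
        node = nodes[idx]
        for unit, unit_score in node.hypotheses.items():
            # Skip empty units and units that do not continue the term here.
            if not unit or not term.startswith(unit, pos):
                continue
            new_pos = pos + len(unit)
            new_path = path + [idx]
            if new_pos == len(term):
                results.append((score * unit_score, new_path))
            else:
                for nxt in node.successors:
                    walk(nxt, new_pos, score * unit_score, new_path)

    for idx in start_nodes:
        walk(idx, 0, 1.0, [])
    return sorted(results, reverse=True)


# Hypothetical ink: segments 0 and 1 may be read separately ("c", "l")
# or together as one unit ("d"), followed by "o" and "g".
nodes = [
    Node((0,), {"c": 0.6, "e": 0.3}, [1]),
    Node((1,), {"l": 0.7, "i": 0.2}, [3]),
    Node((0, 1), {"d": 0.8, "a": 0.1}, [3]),
    Node((2,), {"o": 0.9}, [4]),
    Node((3,), {"g": 0.9, "q": 0.1}, []),
]
starts = [0, 2]

hits = search_term(nodes, "dog", starts)
print(hits)
```

Because alternative segmentations are kept side by side in the graph, the same ink can match both "dog" and "clog" in this sketch, with the path score indicating which reading is better supported.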