Users work with media (e.g., images, handwriting, and scanned documents) that include text. An optical character recognition computing device system (“OCR system”) may execute operations that convert different types of text to an electronic text format, such that the text may be edited, stored, searched, or displayed. Optical character recognition can be implemented in different contexts for a variety of inputs, for example, data entry, automated number plate recognition, and handwriting conversion, with each context presenting different challenges and opportunities for OCR systems. With the ever-increasing use of computing devices, improvements in how OCR systems extract relevant text elements can make such systems more reliable. As such, developing new operations in OCR systems can help provide more efficient processing and more accurate OCR results.
Users also work with audio in different capacities (e.g., music, communications, video) directed to particular end users or audiences. Digital audio, in particular, is audio that is recorded in or converted to digital form, where the sound waves of an audio signal are digitally encoded. A speech recognition computing system (“speech recognition system”) may operate on digital audio, in that spoken language may be converted into digital form and processed to identify text (e.g., “speech-to-text”). An acoustic model and other components of a speech recognition system, sometimes trained for a particular user's voice, may be used to support the conversion of speech to text.
With the growth of voice-recognition-based systems, improvements in how speech recognition systems recognize speech may make such systems more dependable. As such, improving existing features of speech recognition systems can help improve the computer functions that provide speech recognition functionality.