Voice recognition systems capable of detecting audio speech signals and converting them into corresponding text have been deployed to allow users to interact with computers through voice commands. For example, voice recognition systems have been used to automate the answering and processing of customer service calls.
Optical Character Recognition (OCR) systems capable of extracting text from images have been deployed to facilitate copying and searching of text stored in image files. For example, OCR systems have been used to extract text from images stored in Portable Document Format (PDF) files.
Text recognition systems (e.g., voice recognition systems or OCR systems) may use machine learning models that are trained using a large set of training data. For example, the training data may include audio voice signals and paired text equivalents or labels. Generating the data set used to train a text recognition system may be expensive, because each input in the data set must be paired with a text equivalent or label.
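The pairing of input signals with text labels described above can be illustrated with a minimal sketch. The names `LabeledExample` and `build_training_set`, and the short lists of numbers standing in for audio samples, are hypothetical and chosen only for illustration; they do not describe any particular text recognition system.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class LabeledExample:
    """One training pair: an input signal and its paired text label."""
    signal: List[float]  # hypothetical stand-in for audio samples or image pixels
    text: str            # the paired text equivalent (label)


def build_training_set(
    signals: Sequence[Sequence[float]], labels: Sequence[str]
) -> List[LabeledExample]:
    """Pair each signal with its label; every signal needs exactly one label."""
    if len(signals) != len(labels):
        raise ValueError("each signal must have exactly one text label")
    return [LabeledExample(list(s), t) for s, t in zip(signals, labels)]


# Two toy signals with made-up values, each paired with a text label.
training_set = build_training_set(
    [[0.0, 0.1, -0.2], [0.3, 0.0, 0.1]],
    ["hello", "goodbye"],
)
```

In this sketch, an input without a label is rejected, which reflects why assembling a large data set may be costly: a text equivalent must be supplied for every input.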