The present disclosure generally relates to the recognition of textual terms, and more specifically to speech recognition using a language model adapted with external data.
Some attempts to extend data for training a model for speech recognition are known in the art, as exemplified in the following publications.
WO1999/050830 reports a language model used in a speech recognition system which has access to a first, smaller data store and a second, larger data store. The language model is adapted by formulating an information retrieval query based on information contained in the first data store and querying the second data store. Information retrieved from the second data store is used in adapting the language model.
EP2273490 reports a speech recognition device that may adapt or otherwise modify a generic language model based on a retrieved corpus of text.
WO2006/099621 reports forming and/or improving a language model based on data from a large collection of documents, such as Web data. The collection of documents is queried using queries that are formed from the language model. The language model is subsequently improved using the information thus obtained, and the improvement is in turn used to improve the queries.