Phonetic speech search involves searching a database of audio records for words and phrases by matching the audio against a model of the expected sound patterns of the search term. This technique contrasts with speech-to-text (STT)-based approaches, which search the output of a large-vocabulary speech recognizer. An advantage of the phonetic search approach is that it is not constrained by the vocabulary or the recognition errors of an STT system. However, phonetic searching can suffer from false matches on similar-sounding but unwanted phrases. For example, searching for the word “contract” in speech that contains the word “contact” is likely to give false matches. If a large amount of the audio contains such similar-sounding but unwanted phrases, the volume of false matches can lead to poor search results.
One way of addressing this problem is to try to specify sufficiently long search phrases to neutralize the effect of confusable words. For example, a search for “cancel my contract” may not give any false hits on “contact” if the word “contact” is not preceded by “cancel my.” Facilities for manually filtering and tagging results may also be offered. Unfortunately, these solutions are cumbersome and can result in certain relevant audio records not being returned because they do not exactly match the longer search phrase.
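The trade-off described above can be illustrated with a toy sketch. It is not the method of any real phonetic search system: actual systems score audio against acoustic models, whereas this sketch assumes clean phoneme strings stand in for the audio. The hand-written ARPAbet-style transcriptions, the names `LEXICON`, `phonemes`, `min_substring_distance`, and `is_hit`, and the 0.2 error-rate threshold are all assumptions made for illustration.

```python
# Toy illustration only: phoneme strings stand in for audio, and an
# edit-distance threshold stands in for an acoustic match score.

# Hand-written ARPAbet-style transcriptions (assumed for illustration).
LEXICON = {
    "contract": "K AA N T R AE K T",
    "contact": "K AA N T AE K T",
    "cancel": "K AE N S AH L",
    "please": "P L IY Z",
    "want": "W AA N T",
    "out": "AW T",
    "of": "AH V",
    "my": "M AY",
    "me": "M IY",
    "i": "AY",
}


def phonemes(text):
    """Convert a phrase to a flat list of phonemes using LEXICON."""
    return [p for word in text.lower().split() for p in LEXICON[word].split()]


def min_substring_distance(query, audio):
    """Smallest edit distance between query and any contiguous part of audio.

    Standard approximate-substring dynamic programming: the first row is all
    zeros so that a match may start anywhere in the audio.
    """
    prev = [0] * (len(audio) + 1)
    for i, q in enumerate(query, 1):
        cur = [i]
        for j, a in enumerate(audio, 1):
            cur.append(min(prev[j] + 1,               # query phoneme missing
                           cur[j - 1] + 1,            # extra audio phoneme
                           prev[j - 1] + (q != a)))   # match or substitution
        prev = cur
    return min(prev)  # a match may end anywhere in the audio


def is_hit(query_text, audio_text, max_error_rate=0.2):
    """Report a hit when the per-phoneme error rate is within the threshold."""
    query = phonemes(query_text)
    audio = phonemes(audio_text)
    return min_substring_distance(query, audio) / len(query) <= max_error_rate


if __name__ == "__main__":
    print(is_hit("contract", "please contact me"))                   # True
    print(is_hit("cancel my contract", "please contact me"))         # False
    print(is_hit("cancel my contract", "I want out of my contract")) # False
    print(is_hit("contract", "I want out of my contract"))           # True
```

Under these assumptions, the short query produces a false hit on “contact” because the two words differ by a single phoneme. The longer query suppresses that false hit, but it also misses a relevant record that phrases the request differently.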
Accordingly, a need exists for an improved phonetic speech searching solution that avoids false matching problems while still permitting simple search strategies.