Entertainment systems often use speech recognition to process user queries expressed in natural language and to provide a smoother user interface. For example, instead of manually searching for Tom Cruise movies by selecting an “actor” search option and typing “Tom Cruise” on a keyboard, a user may simply say “Show me Tom Cruise movies.” An interactive media guidance application may recognize the spoken query using speech and/or voice recognition techniques, parse the words, and interpret the meaning of the query using syntactic and semantic information. However, traditional speech recognition techniques are limited to processing a single recognition hypothesis of a search query generated by an automatic speech recognition module. That single hypothesis may contain errors and mischaracterize the user input, which can lead to erroneous search results, and the errors can propagate to other system components that rely on the hypothesis.