This invention relates to speech recognition apparatus, particularly but not exclusively to speech recognition apparatus which receives a plurality of related speech signals and uses the received speech signals to identify an entry contained in a database.
In an operator-based telesales service which requires the user to provide an address, the postcode is often requested. The postcode is used to access an address database and to identify an entry from the address database corresponding to the postcode provided. For unique identification of the required address amongst the subset of addresses having that postcode in common, provision of a house name or number is all that is required. However, in automated systems for retrieving an address from an address database, postcode recognition alone is not sufficiently accurate. For example, the accuracy of a postcode recogniser has been reported to be as low as 66% when speech recognition is performed on speech received from a telephone network. Therefore a more extensive dialogue requesting more information from the user is required.
If a service is interactive, any uncertainty about whether a recognition result is correct may be dealt with by asking the user to confirm that the recognised utterance is correct. However, if the service is offline, the speech recognition apparatus must make the best use of all the information it has. For example, in a service which requires an entry in a database to be identified, this information will be any speech signals the speech recognition apparatus has received from the user and the information in the database regarding valid entries.
In a customer account database, for example, the user may provide speech signals representing their name and their account number. A speech recognition process is performed both on the speech signal representing the name and on the speech signal representing the account number, and the recognised name and account number may then be compared with the entries in the database. If the recognised name and account number do not correspond to a valid entry, the identification of an entry is considered to have failed.
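The baseline matching described above, in which only the single recognised name and the single recognised account number are checked against the database, might be sketched as follows. This is a minimal illustration rather than any particular system: the `ACCOUNTS` table, its contents and the function name `identify_entry` are hypothetical, and the recognised strings stand in for the output of a speech recogniser.

```python
# Hypothetical customer account database: account number -> name.
ACCOUNTS = {
    "1001": "smith",
    "1002": "smyth",
    "1003": "jones",
}


def identify_entry(recognised_name, recognised_number, accounts=ACCOUNTS):
    """Return the matching (number, name) entry, or None if identification fails."""
    if accounts.get(recognised_number) == recognised_name:
        return recognised_number, recognised_name
    return None


# Both items recognised correctly: the entry is identified.
print(identify_entry("smith", "1001"))
# The name is misrecognised as a similar-sounding one: identification fails.
print(identify_entry("smyth", "1001"))
```

In this sketch a single misrecognition in either utterance is enough to make the identification fail, even though the other utterance was recognised correctly.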
According to the present invention there is provided a speech recognition apparatus comprising input means for receiving a speech signal; recognition means coupled to the input means and arranged to provide a first set of one or more items falling within a first vocabulary, the items being derived from a first received speech signal; and provide a second set of one or more items falling within a second vocabulary, the items being derived from a second received speech signal; and comparison means arranged to perform an intersection of the first and second sets whereby the combined set comprises items which fall within both the first set and the second set; provide a resulting combined set of items; and provide as an output a grading signal in dependence upon the number of items which fall within the combined set.
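The comparison step of this aspect can be illustrated with ordinary set intersection. The sketch below is an assumption-laden illustration only: the function name `intersect_and_grade`, the three grade labels and the example account identifiers are hypothetical, since the text states only that the grading signal depends on the number of items in the combined set.

```python
def intersect_and_grade(first_set, second_set):
    """Return the items common to both sets together with a grading signal.

    The grade labels are assumed for illustration; only the dependence on
    the size of the combined set is taken from the text.
    """
    combined = set(first_set) & set(second_set)
    if len(combined) == 0:
        grade = "fail"        # no item is supported by both signals
    elif len(combined) == 1:
        grade = "unique"      # exactly one item is supported by both signals
    else:
        grade = "ambiguous"   # several items remain
    return combined, grade


# Hypothetical candidate sets derived from two different utterances.
from_name = {"A100", "A205", "A317"}
from_number = {"A205", "A990"}
combined, grade = intersect_and_grade(from_name, from_number)
print(combined, grade)
```

A downstream process could, for example, accept a "unique" result directly and treat the other two grades as needing further information.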
According to another aspect of the invention there is also provided a speech recognition apparatus comprising input means for receiving a speech signal; recognition means coupled to the input means and arranged to provide a first set of one or more items falling within a first vocabulary, the items being derived from a first received speech signal; and provide a second set of one or more items falling within a second vocabulary, the items being derived from a second received speech signal; and comparison means arranged to perform a union of the first and second sets whereby the combined set comprises items which fall within the first set or within the second set; and provide a resulting combined set of items.
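The union variant can be sketched in the same way. The function name `union_of_sets` and the example items are hypothetical; the only behaviour taken from the text is that the combined set holds every item that falls within either set.

```python
def union_of_sets(first_set, second_set):
    """Return every item that falls within the first set or the second set."""
    return set(first_set) | set(second_set)


# Hypothetical candidate sets derived from two utterances.
first = {"1001", "1002"}
second = {"1002", "1003"}
combined = union_of_sets(first, second)
print(sorted(combined))
```

Unlike the intersection, the union never discards a candidate that only one of the two recognitions supports, so in this sketch the combined set is at least as large as either input set.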
The recognition means may be further arranged to generate an output set of items falling within the combined set of items, the output set derived from a third received speech signal.
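One way to picture the use of a third speech signal is to treat the combined set as the only permitted vocabulary for the third recognition. The sketch below approximates this by filtering scored hypotheses after the event, which is a simplifying assumption: the function name `output_set`, the scores and the example items are all hypothetical, and an actual recogniser might instead constrain its vocabulary before decoding.

```python
def output_set(third_hypotheses, combined_set):
    """Keep the hypotheses for the third signal that fall within the combined set.

    third_hypotheses maps each candidate item to a hypothetical recogniser
    score (higher is better). The result is ordered best first.
    """
    allowed = [item for item in third_hypotheses if item in combined_set]
    return sorted(allowed, key=lambda item: third_hypotheses[item], reverse=True)


# Hypothetical combined set and hypotheses for the third utterance.
combined = {"1001", "1002"}
third = {"1002": 0.7, "1003": 0.9, "1001": 0.4}
result = output_set(third, combined)
print(result)
```

Here the best-scoring hypothesis for the third signal, "1003", is rejected because it is not in the combined set.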
Alternatively, the first received signal is different from the second received signal and the first set may be derived from the first received signal by generating an intermediate set of items falling within an intermediate vocabulary comprising items in a first field of the database, the intermediate set of items corresponding to the first received speech signal; the first set of items then comprises the items in a second field of those entries in the database which have an item from the intermediate set in the first field.
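The derivation through an intermediate set amounts to a lookup from one database field to another. A minimal sketch follows, assuming a database held as a list of records; the `DATABASE` contents, the field names `name` and `account`, and the function name `derive_set` are hypothetical.

```python
# Hypothetical database: each entry has a "name" field and an "account" field.
DATABASE = [
    {"name": "smith", "account": "1001"},
    {"name": "smyth", "account": "1002"},
    {"name": "jones", "account": "1003"},
]


def derive_set(database, intermediate_set, first_field, second_field):
    """Map items recognised in first_field to the second_field of matching entries."""
    return {
        entry[second_field]
        for entry in database
        if entry[first_field] in intermediate_set
    }


# Intermediate set: names the recogniser considers plausible for the utterance.
intermediate = {"smith", "smyth"}
first_set = derive_set(DATABASE, intermediate, "name", "account")
print(sorted(first_set))
```

Expressing every recognition result in terms of the same field (here, account numbers) is what allows sets derived from different utterances to be compared with each other.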
The second set of items may be similarly derived from the second received signal. The third set of items may be derived similarly in embodiments of the invention utilising a third received signal.
The size of the first and second sets may be limited to a predetermined number of items prior to comparison.
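Limiting the size of a set before comparison requires some rule for choosing which items to keep, which the text does not specify. The sketch below assumes that each item carries a recogniser score and keeps the highest-scoring items; the function name `limit_set`, the scores and the example items are hypothetical.

```python
def limit_set(scored_items, max_items):
    """Keep at most max_items items, preferring the highest scores.

    scored_items maps each item to a hypothetical recogniser score
    (higher is better).
    """
    ranked = sorted(scored_items.items(), key=lambda pair: pair[1], reverse=True)
    return {item for item, _ in ranked[:max_items]}


# Hypothetical scored candidates for one utterance, limited to two items.
candidates = {"smith": 0.9, "jones": 0.5, "smyth": 0.7}
limited = limit_set(candidates, 2)
print(sorted(limited))
```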