1. Technical Field
This invention relates to the field of speech recognition software, and more particularly, to using a speech cache in conjunction with a speech recognition application to improve system performance.
2. Description of the Related Art
Speech recognition is the process by which an acoustic signal received by a microphone is converted to a set of text words by a computer. These recognized words may then be used in a variety of computer software applications for purposes such as document preparation, data entry, and command and control. Improvements to speech dictation systems provide an important way to enhance user productivity.
Some speech recognition applications cannot recognize a user spoken utterance identifying a word or a word phrase without the aid of attributes. This is particularly true of embedded speech recognition applications having limited vocabularies, such as the variety used in navigation systems in automobiles. Attributes provide the speech recognition system with supplemental information detailing the user spoken utterance. Oftentimes, for such a speech recognition system to recognize a user spoken utterance, the user must also issue a lengthy series of attributes. For example, if the user utters a phrase recognized as a speech command such as “how far” in conjunction with a speech object, “Roller Coaster World”, then the system may require attributes identifying the object within the speech command. In this case, to properly identify the object “Roller Coaster World” to the speech recognition system, attributes such as “U.S.A.”, “Florida”, “Orlando”, and “amusement park named Roller Coaster World” may be necessary. Using attributes to specify a speech object within a speech command can be analogized to navigating through a system of computer directories to find a particular computer file.
Even more troublesome is the case when the user repeatedly issues the same speech command or issues a series of subsequent speech commands involving the same object. This situation commonly occurs in the case of a user driving to a distant location for vacation. Conventional systems do not store previously issued speech commands, objects, or attributes. Thus, each time the user issues a speech command regarding a previously identified object, the user must also provide the system with the previously mentioned attributes identifying the object. For example, each user command requesting information such as the distance or route to “Roller Coaster World” must be accompanied by the lengthy list of attributes identifying “Roller Coaster World” to the system. A significant amount of time and efficiency could be gained if speech recognition systems having a limited vocabulary could more efficiently recall previously used commands or objects.
The invention disclosed herein for improving system performance of speech systems in accordance with the inventive arrangements satisfies the long-felt need of the prior art by using a speech cache and speech cache logic in conjunction with the recognition system. Such speech systems can recall particular objects or speech commands from a speech cache, thereby eliminating the need for users to continually utter redundant attributes to the speech system in an effort to properly describe a speech object. Because speech systems frequently reuse the same set of commands or objects, the cache is a cost-effective method of enhancing memory systems using statistical means, without having to resort to the expense of making the whole memory system faster.
The invention concerns a method and a system for improving recall of speech data in a computer speech system. Significantly, the speech system can be an embedded computer speech system. The method of the invention involves a plurality of speech cache management steps including providing a speech cache; receiving a speech system input and identifying a speech event in the received speech system input, the speech event comprising speech data. Subsequently, the speech data can be compared to pre-determined speech cache entry criteria; and, if the speech data meets one of the pre-determined entry criteria, at least one entry can be added to the speech cache, the at least one entry corresponding to the speech data. Additionally, the speech data can be compared to pre-determined speech cache exit criteria; and, if the speech data meets one of the pre-determined exit criteria, at least one entry can be purged from the speech cache, the at least one entry corresponding to the speech data.
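By way of illustration only, the cache management steps described above can be sketched as follows. The sketch is not taken from the disclosure: the names `SpeechData`, `SpeechCache`, and `process`, as well as the particular criteria used in the example, are hypothetical and chosen only to show how speech data might be compared against entry criteria and exit criteria.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class SpeechData:
    """Hypothetical record for speech data carried by a speech event."""
    text: str                      # recognized command or object
    attributes: Tuple[str, ...]    # attributes that identify the object
    use_count: int = 0
    important: bool = False


# A criterion is any predicate over speech data (an assumption of this sketch).
Criterion = Callable[[SpeechData], bool]


@dataclass
class SpeechCache:
    entry_criteria: List[Criterion]
    exit_criteria: List[Criterion]
    entries: Dict[str, SpeechData] = field(default_factory=dict)

    def process(self, data: SpeechData) -> None:
        # Add an entry if the speech data meets any one of the entry criteria.
        if any(criterion(data) for criterion in self.entry_criteria):
            self.entries[data.text] = data
        # Purge the corresponding entry if any one of the exit criteria is met.
        if any(criterion(data) for criterion in self.exit_criteria):
            self.entries.pop(data.text, None)


# Example criteria (illustrative values only).
cache = SpeechCache(
    entry_criteria=[lambda d: d.important, lambda d: d.use_count >= 3],
    exit_criteria=[lambda d: d.use_count == 0 and not d.important],
)
park = SpeechData("Roller Coaster World", ("U.S.A.", "Florida", "Orlando"), use_count=3)
cache.process(park)  # meets the frequency entry criterion, so it is cached
```

Once the object is cached, a later command such as “how far” could in principle resolve “Roller Coaster World” from the cache without the user repeating the attributes.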
In the preferred embodiment, the entry criteria comprise frequently used speech data, recently used speech data, and important speech data. Similarly, the exit criteria can comprise least frequently used speech data associated with each entry in the speech cache, least recently used speech data associated with each entry in the speech cache and least important speech data associated with each entry in the speech cache.
The method of the invention can also include a speech cache filtering process. Specifically, an embodiment incorporating speech cache filtering can compare entries in the speech cache with filtering criteria; and, sort the entries according to the filtering criteria. The filtering criteria can comprise frequency of use of speech data associated with each entry in the speech cache, recency of use of speech data associated with each entry in the speech cache, and importance of use of speech data associated with each entry in the speech cache.
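The filtering process can be pictured, purely as a sketch, as a sort keyed on the selected filtering criterion. The `Entry` fields, the `FILTERS` mapping, and the function name `filter_entries` are assumptions made for this example and do not appear in the disclosure.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Entry:
    """Hypothetical speech cache entry with the three filterable properties."""
    text: str
    use_count: int      # frequency of use
    last_used: float    # time of last use, in seconds (larger is more recent)
    importance: int     # larger is more important


# Each filtering criterion maps to a sort key (assumed representation).
FILTERS = {
    "frequency": lambda e: e.use_count,
    "recency": lambda e: e.last_used,
    "importance": lambda e: e.importance,
}


def filter_entries(entries: List[Entry], criterion: str) -> List[Entry]:
    """Return the entries sorted by the given criterion, highest first."""
    return sorted(entries, key=FILTERS[criterion], reverse=True)


entries = [
    Entry("how far", use_count=5, last_used=100.0, importance=1),
    Entry("Roller Coaster World", use_count=2, last_used=300.0, importance=3),
    Entry("route to", use_count=9, last_used=200.0, importance=2),
]
by_frequency = [e.text for e in filter_entries(entries, "frequency")]
# by_frequency == ["route to", "how far", "Roller Coaster World"]
```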
In the preferred embodiment, the speech system input can be one of a system event and a speech event. To accommodate system events, the method of the invention can further establish a table of system events and corresponding speech cache commands. Responsive to receiving a system event, the received system event can be compared to the system events in the table. If the received system event matches a system event in the table, the speech cache command corresponding to the matching system event in the table can be performed. Notably, the corresponding speech cache commands can include purge commands and add commands.
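The table of system events and corresponding speech cache commands can be sketched as a simple lookup. The event names `"route_planned"` and `"destination_reached"` are invented for this example, as are the function names; the disclosure states only that the commands can include add commands and purge commands.

```python
from typing import Callable, Dict, Optional, Set


def add_command(cache: Set[str], payload: Optional[str]) -> None:
    """Add command: place the speech data named by the payload in the cache."""
    if payload is not None:
        cache.add(payload)


def purge_command(cache: Set[str], payload: Optional[str]) -> None:
    """Purge command: here, simply empty the cache (an assumed behavior)."""
    cache.clear()


# Table of system events and corresponding speech cache commands.
# The event names are hypothetical.
EVENT_TABLE: Dict[str, Callable[[Set[str], Optional[str]], None]] = {
    "route_planned": add_command,
    "destination_reached": purge_command,
}


def handle_system_event(cache: Set[str], event: str, payload: Optional[str] = None) -> bool:
    """Perform the command for a matching event; return False if none matches."""
    command = EVENT_TABLE.get(event)
    if command is None:
        return False
    command(cache, payload)
    return True


cache: Set[str] = set()
handle_system_event(cache, "route_planned", "Roller Coaster World")
```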
In the preferred embodiment, the comparing step can comprise evaluating the speech system input against user-configurable rules for adding and deleting from the speech cache entries corresponding to the speech data, the rules based on frequency of use of the speech data, recency of use of the speech data and importance of use of the speech data. Similarly, the comparing step can comprise the step of evaluating the speech system input against system-configured rules for adding and deleting from the speech cache entries corresponding to the speech data, the rules based on a pre-specified list of speech data. Significantly, comparisons performed against the system-specified entry and exit criteria can be overridden with the comparisons performed against the user-specified entry and exit criteria.
In a preferred embodiment, the method of the invention can further include establishing a frequency counter for the speech data. Responsive to receiving a speech event, the frequency counter corresponding to the speech data can be incremented. Thus, the comparing step can comprise evaluating the speech system input against user-configurable rules for adding and deleting from the speech cache entries corresponding to the speech data, the rules based on frequency of use of the speech data. In that case, the frequency can be measured by the frequency counter established for the speech data. Moreover, the adding step can be performed in response to a frequency indicated by the frequency counter exceeding a pre-determined threshold. Likewise, the deleting step can be performed in response to a frequency indicated by the frequency counter falling below a pre-determined threshold.
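A minimal sketch of the frequency counter follows. The class and method names are hypothetical. Note in particular that the disclosure does not state how a counted frequency comes to fall below the deletion threshold; the periodic halving in `decay` is an assumption introduced only so that the deleting step can be demonstrated.

```python
from collections import Counter


class FrequencyTrackedCache:
    """Hypothetical cache whose entries are governed by per-item counters."""

    def __init__(self, add_threshold: int, delete_threshold: int) -> None:
        self.add_threshold = add_threshold
        self.delete_threshold = delete_threshold
        self.counts: Counter = Counter()   # one frequency counter per speech data
        self.entries: set = set()

    def on_speech_event(self, speech_data: str) -> None:
        # Increment the counter in response to receiving a speech event.
        self.counts[speech_data] += 1
        # Add once the counted frequency exceeds the add threshold.
        if self.counts[speech_data] > self.add_threshold:
            self.entries.add(speech_data)

    def decay(self) -> None:
        # ASSUMPTION: counts are halved periodically so that frequency can fall.
        for key in list(self.counts):
            self.counts[key] //= 2
            # Delete once the counted frequency falls below the delete threshold.
            if self.counts[key] < self.delete_threshold:
                self.entries.discard(key)


cache = FrequencyTrackedCache(add_threshold=2, delete_threshold=2)
for _ in range(3):
    cache.on_speech_event("Roller Coaster World")
# The count is now 3, which exceeds the add threshold of 2.
```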
Advantageously, the method can further comprise the steps of sensing when the speech cache is full; and, responsive to sensing a full speech cache, purging entries from the speech cache according to pre-determined purging criteria. Notably, like the exit criteria, the purging criteria can include least frequently used speech data associated with each entry in the speech cache, least recently used speech data associated with each entry in the speech cache and least important speech data associated with each entry in the speech cache. The purging step can include the steps of: displaying a list of speech cache entries sorted according to the purging criteria; accepting confirmation from a user before purging entries in the speech cache selected for purging based on the purging criteria; and, in response to receiving the confirmation, purging the selected speech cache entries.
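The purging steps can be sketched as shown below, using least frequently used as the purging criterion. The function name `purge_when_full` and the `confirm` callback are assumptions of this example; the callback stands in for displaying the sorted list to the user and accepting the user's confirmation.

```python
from typing import Callable, Dict, List


def purge_when_full(
    entries: Dict[str, int],               # entry text -> frequency of use
    capacity: int,
    n_to_purge: int,
    confirm: Callable[[List[str]], bool],  # stands in for display plus user confirmation
) -> List[str]:
    """Purge the least frequently used entries once the cache is full."""
    # Sense whether the speech cache is full.
    if len(entries) < capacity:
        return []
    # Sort by the purging criterion and select the candidates for purging.
    candidates = sorted(entries, key=entries.get)[:n_to_purge]
    # Present the candidates and purge only after confirmation is received.
    if not confirm(candidates):
        return []
    for text in candidates:
        del entries[text]
    return candidates


entries = {"Roller Coaster World": 5, "Main Street": 1, "Airport": 3}
purged = purge_when_full(entries, capacity=3, n_to_purge=1, confirm=lambda shown: True)
# purged == ["Main Street"]
```

Passing the confirmation step as a callback keeps the purge logic independent of any particular display, which is a design choice of this sketch rather than a requirement of the method.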
Finally, in an alternative embodiment of the present invention, the method can include the steps of associating expiration data with at least one entry in the speech cache; and purging the associated entries in the speech cache according to the expiration data. In the alternative embodiment, the associating step can comprise the steps of accepting user-specified expiration data; and, associating the user-specified expiration data with at least one user specified entry in the speech cache.
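As a sketch of the alternative embodiment, expiration data can be modeled as an optional timestamp stored with each entry. The class name `ExpiringSpeechCache`, its methods, and the use of a numeric timestamp are assumptions; the disclosure does not specify the form that expiration data takes.

```python
import time
from typing import Dict, List, Optional


class ExpiringSpeechCache:
    """Hypothetical cache in which entries may carry expiration data."""

    def __init__(self) -> None:
        # entry text -> expiration time in seconds, or None for no expiration
        self._entries: Dict[str, Optional[float]] = {}

    def add(self, text: str, expires_at: Optional[float] = None) -> None:
        # Associate user-specified expiration data with the entry.
        self._entries[text] = expires_at

    def purge_expired(self, now: Optional[float] = None) -> List[str]:
        """Purge entries whose expiration time has passed; return their texts."""
        current = time.time() if now is None else now
        expired = [
            text for text, expires_at in self._entries.items()
            if expires_at is not None and expires_at <= current
        ]
        for text in expired:
            del self._entries[text]
        return expired

    def __contains__(self, text: str) -> bool:
        return text in self._entries


cache = ExpiringSpeechCache()
cache.add("Roller Coaster World", expires_at=100.0)  # e.g. the end of a vacation
cache.add("Home")                                    # never expires
```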
According to a second aspect, the invention can be a computer speech system for managing a speech cache. Significantly, the speech system can be adapted for use in a vehicle. Moreover, the speech system can be adapted for use in a vehicle navigation system. In the second aspect of the invention, the system can comprise: a speech enabled application where the speech enabled application is coupled to a speech recognition engine and the speech enabled application and the speech recognition engine are configured to process speech data. Also included is a speech cache for storing entries corresponding to the speech data and predetermined speech cache entry and exit criteria. The entry criteria specify rules for adding entries corresponding to the speech data to the speech cache. Similarly, the exit criteria specify rules for purging entries corresponding to the speech data from the speech cache. Finally, the system can include speech cache logic for comparing the speech data to the pre-determined entry and exit criteria. The speech cache logic can add to the speech cache at least one entry corresponding to speech data meeting the pre-determined entry criteria. Likewise, the speech cache logic can purge from the speech cache at least one entry corresponding to speech data meeting the pre-determined exit criteria.
In the preferred embodiment, the speech cache is a circular cache. Moreover, the entries in the speech cache comprise speech commands, speech objects, pointers to speech commands and pointers to speech objects. The entries can further comprise at least one entry having corresponding expiration data. In that instance, the speech cache logic can purge the at least one entry having corresponding expiration data according to the expiration data.
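The disclosure does not define the circular cache further. One common reading, assumed here, is a fixed-size buffer in which a new entry overwrites the oldest entry once the buffer is full. Under that assumption, the behavior can be sketched with the standard `collections.deque`; the capacity of three and the entry strings are illustrative only.

```python
from collections import deque

# A fixed-size buffer: appending to a full deque discards the oldest item.
speech_cache = deque(maxlen=3)

for entry in ["how far", "Roller Coaster World", "route to", "Airport"]:
    speech_cache.append(entry)

# "how far" was the oldest entry and has been overwritten.
remaining = list(speech_cache)
# remaining == ["Roller Coaster World", "route to", "Airport"]
```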
In the preferred embodiment, the speech cache logic is adapted to receive system events in the speech system. The speech cache logic can further include a table of system events and corresponding speech cache commands. The speech cache logic can be adapted to perform a speech cache command in response to receiving a corresponding system event. Notably, the pre-determined entry and exit criteria can include a speech cache command, frequency of use of the speech data, recency of use of the speech data and importance of use of the speech data. Moreover, the speech cache command can include an add command and a purge command.
The system can further include pre-determined purging criteria. Where the system includes purging criteria, the speech cache logic, in response to receiving the purge command, can purge entries in the speech cache according to the purging criteria. Like the pre-determined exit criteria, the pre-determined purging criteria can include frequency of use of the speech data corresponding to the entries in the speech cache, recency of use of the speech data corresponding to the entries in the speech cache and importance of use of the speech data corresponding to the entries in the speech cache. Finally, a system in accordance with the inventive arrangements which incorporates purging criteria can further include a display for displaying to a user a list of entries in the speech cache selected for purging based on the purging criteria. As such, the system can confirm the purge command before purging the selected entries.
The speech cache logic can further comprise filtering logic for sorting the entries in the cache according to pre-determined filtering criteria. Like the entry criteria, the filtering criteria can comprise frequency of use of the speech data corresponding to the entries, recency of use of the speech data corresponding to the entries and importance of use of the speech data corresponding to the entries. Notably, in the preferred embodiment, the speech cache logic can further comprise at least one incrementable frequency counter corresponding to particular speech data. In consequence, the frequency of use of the particular speech data can be indicated by the frequency counter. In the preferred embodiment, the frequency counter can be incremented in response to the speech cache logic receiving an instance of the particular speech data from the speech system.
According to a third aspect, the invention may comprise a machine readable storage, having stored thereon a computer program having a plurality of code sections executable by a machine for causing the machine to perform the above-described method of the invention.