1. Field of the Invention
The invention disclosed herein relates generally to electronic switching devices that also provide for biometric acquisition and comparison of human voice patterns in order to distinguish between individuals. When a successful match is made between a live utterance sample and stored biometric patterns, such a device performs a switching action, or issues a command by other digital communication means to control an external device.
2. Description of Related Art
The following patents form a background for the instant invention. None of the cited publications is believed to detract from the patentability of the claimed invention.
Hair et al., U.S. Pat. No. 3,673,331, describes voice verification that is accomplished at a plurality of spaced-apart facilities each having a plurality of terminals. Multiplexing structure interconnects the terminals through a communications link to a central processing station. Analog reproductions of voices transmitted from the terminals are converted into digital signals. The digital signals are transformed into the frequency domain at the central processing station. Predetermined features of the transformed signals are compared with stored predetermined features of each voice to be verified. A verify or non-verify signal is then transmitted to the particular terminal in response to the comparison of the predetermined features. The principal disadvantage is that the system cannot operate as a self-contained device and requires a network of terminals and a central processing station for authentication of an individual.
Kishi et al., U.S. Pat. No. 4,450,545, describes a voice responsive door lock system for a motor vehicle wherein the operation of the door lock device is vocally controlled by the driver. The voice responsive door lock system for a motor vehicle comprises a door position detection means, an indication means for indicating a question as to the necessity of locking the door, a voice recognition unit for identifying the driver's reply and producing a door lock command signal, and a door lock control means for actuating a door lock device upon receiving the door lock command signal. This teaches a method for locking a door with a voice command, but has the disadvantage of using speaker-independent voice recognition technology instead of biometric speaker verification technology, making it unsuitable for unlocking the previously locked door from outside the secured space.
Cavazza et al., U.S. Pat. No. 4,752,958, discloses a device that obtains several characteristic parameters from a standard sentence said by a speaker and compares them with average parameters of the same speaker stored in an internal memory and previously calculated. According to the comparison, it obtains a probability value that the sentence spoken belongs to that speaker and compares the value with a threshold normalized to the average parameter variance by a threshold calculation circuit. If the threshold is exceeded, the device considers the speaker verified. A circuit determines the real instants of sentence beginning and end using a noise-adaptive threshold in order to limit between these two instants the time interval over which characteristic parameters are to be calculated. A circuit aligns as to time the characteristic parameters just calculated to the parameters of a reference sentence, obtaining standard lengths of the sounds composing the sentence spoken. A variable probability threshold is controlled by the standard deviations of the histogram of the average of the characteristic parameter vectors. The disadvantage of a cadence-based voice reference pattern is that it is not suitable for rejecting random background noise, since it does not account for differences in the frequency domain; simply humming in the correct cadence would therefore yield a positive result even though the correct phrase was not spoken.
Muroi et al., U.S. Pat. No. 4,833,713, describes a voice or sound recognition system including a microphone for converting a voice into an electrical voice signal, a frequency analyzer for generating a voice pattern in the form of a time-frequency distribution, and a matching unit for matching the voice pattern with registered voice patterns. This is one of many published methods for a biometric acquisition and matching system, but it does not define a method or apparatus for managing multiple commands and individuals, or for translating the verification results into an appropriate and useful output result.
Schneider et al., U.S. Pat. No. 4,856,072, describes a voice actuated vehicle security system that includes both internal and external microphones for receiving vocal instructions and internal and external speakers for delivering vocal messages. During a training period, a plurality of voice recognition templates are stored in memory representing one or more authorized vehicle operators. A voice recognition and synthesis unit interfaces the microphones and the speakers with a microcomputer to permit the system to respond to changes in vehicle conditions by delivering the associated vocal messages and to respond to vocal instructions to control vehicle elements such as door locks, lights, etc. The principal disadvantage is that the device is not useful for controlling a wide variety of external devices, but rather only the vehicle security system that contains the biometric acquisition module.
Parra, U.S. Pat. No. 5,313,556, describes a method for determining the identity of an individual (known or unknown) by a sonic profile of sounds issued through his oral-nasal passages. The sounds are converted to digital electrical signals and produce a three domain format of frequency, amplitude and time samples to produce an array of peaks and valleys constituting the sonic profile of an individual. The sonic profile of the unknown individual is compared with a source or library of sonic profiles, in the same format, of a known individual, the comparison including the relative positions of said peaks and valleys, and a utilization signal is provided upon detecting or not detecting a correlation between said sonic profiles. The disadvantage of this system is that it requires the individual to carry a magnetic card containing the identity of the individual, which can be misplaced, rendering the system unusable for that individual. It also requires an intermediate manual step to translate an audio recording into a usable reference pattern.
Hattori, U.S. Pat. No. 6,094,632, discloses a speaker recognition device that, in order to judge whether or not an unknown speaker is an authentic registered speaker himself/herself, executes ‘text verification using speaker independent speech recognition’ and ‘speaker verification by comparison with a reference pattern of a password of a registered speaker’. A presentation section instructs the unknown speaker to input an ID and utter a specified text designated by a text generation section and a password. The ‘text verification’ of the specified text is executed by a text verification section, and the ‘speaker verification’ of the password is executed by a similarity calculation section. The judgment section judges that the unknown speaker is the authentic registered speaker himself/herself if both the results of the ‘text verification’ and the ‘speaker verification’ are affirmative. According to the device, the ‘text verification’ is executed using a set of speaker independent reference patterns, and the ‘speaker verification’ is executed using speaker reference patterns of passwords of registered speakers, whereby the storage capacity for storing reference patterns for verification can be considerably reduced. Preferably, ‘speaker identity verification’ between the specified text and the password is executed. The disadvantage of this system is that it involves a two-part verification, which is inefficient and cumbersome for individuals using the system.
Matulich et al., U.S. Pat. No. 6,188,986, describes a voice activated device for producing control signals in response to speech that is self-contained and requires no additional software or hardware. The device may be incorporated into a housing that replaces a wall switch that is connected to an AC circuit. An alternate housing is portable and includes a jack that plugs into and lies flush against a standard AC utility outlet, and at least one plug for accepting an AC jack of any electronic product or appliance. The device acts as a control interface between utility power and connected electrical devices by connecting or disconnecting power to the electrical devices based on speech commands. The system described offers voice control of a self-contained switching device using speaker-independent recognition technology. The disadvantage is that it can be used by anyone and cannot distinguish between different individuals; it is therefore unsuitable for controlling access restriction devices (such as locks, etc.) and is also unable to provide administrative features restricted to a select group of authorized individuals. The device disclosed also fails to intelligently switch between opposing states using a single, user selectable command, and does not intelligently track a state in order to change switching behavior using said single, user selectable command.
Clements et al., U.S. Pat. No. 6,519,565, describes a security method that compares a present verbal utterance with a previously recorded verbal utterance by comparing time-frequency domain representations of the utterances, with multiple repeat utterances forming a basis for determining a variation in repetitious performance by an individual, and similar differences between enrollment and challenge utterances forming a basis for similar analysis of variance between enrollment and challenge utterances. In one embodiment a set of enrollment data is searched by each challenge until either a match is made, indicating an action, possibly dependent upon the specific match, or no match is made, indicating an abort. In one application an individual is accepted or rejected as an impostor; in another application, a selected action is accepted as corresponding to a verbal command. This method defines a single security action, possibly different for each voice command, but does not define a plurality of opposing signals for the same voice command spoken by the same individual depending on the current state of the host device. It also does not teach a method for using the system's verification result to offer administrative features to only a specific subset of authorized individuals.
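The state-dependent behavior that the foregoing references do not define can be illustrated with a minimal sketch. The class name, method name, signal names, and initial state below are hypothetical and chosen only for illustration; the sketch assumes that speaker verification has already produced a pass or fail result, and it is not taken from any of the cited patents.

```python
from dataclasses import dataclass


@dataclass
class ToggleSwitch:
    """Hypothetical host device with two opposing states (e.g. a lock)."""

    engaged: bool = True  # assumed initial state: locked / on

    def on_verified_command(self, speaker_verified: bool) -> str:
        # Do nothing unless the biometric verification succeeded.
        if not speaker_verified:
            return "NO_ACTION"
        # The same spoken command produces the opposite signal
        # depending on the state tracked by the host device.
        self.engaged = not self.engaged
        return "ENGAGE" if self.engaged else "RELEASE"
```

In this sketch a single command from a verified individual first releases the device and, when spoken again, engages it, while an unverified utterance leaves the tracked state unchanged.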
Charlet, U.S. Pat. No. 7,409,343, teaches verification score normalization in a speaker voice recognition device. During a learning phase, a speech recognition device generates parameters of an acceptance voice model relating to a voice segment spoken by an authorized speaker and a rejection voice model. It uses normalization parameters to normalize a speaker verification score depending on the likelihood ratio of a voice segment to be tested and the acceptance model and rejection model. The speaker obtains access to a service application only if the normalized score is above a threshold. According to the invention, a module updates the normalization parameters as a function of the verification score on each voice segment test only if the normalized score is above a second threshold. This patent teaches one method for updating a previously stored speech reference pattern, but does so in a destructive manner. The disadvantage of this system is that an update to the reference pattern acts on the reference data as a whole and does not preserve any of the original proven reference patterns. This can lead to a “lockout” condition in which the reference pattern is updated so as to become unusable by an individual in several locations, due to the updated average incorporating a sample captured in a different ambient audio environment than will be used on subsequent authorization attempts.
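The lockout condition can be illustrated with a minimal numeric sketch. It assumes, purely for illustration, that a reference pattern is a short feature vector, that matching uses Euclidean distance against a fixed threshold, and that a destructive update is a weighted average; the function names and values are hypothetical and do not represent the method of Charlet or of any other cited reference.

```python
import math


def distance(a, b):
    # Euclidean distance between two equal-length feature vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def averaged_update(reference, sample, weight=0.5):
    # Destructive style of update: blend the new sample into the reference.
    return [(1 - weight) * r + weight * s for r, s in zip(reference, sample)]


def matches(references, sample, threshold=1.0):
    # A sample is accepted if it is close enough to any stored reference.
    return any(distance(r, sample) <= threshold for r in references)


original = [0.0, 0.0]       # pattern enrolled in the usual environment
away_sample = [0.9, 0.0]    # accepted sample from a different environment
home_sample = [-0.7, 0.0]   # later attempt in the usual environment

updated = averaged_update(original, away_sample)

# Overwriting the original with the average rejects the later attempt,
# whereas keeping the original alongside the update still accepts it.
locked_out = not matches([updated], home_sample)
still_accepted = matches([original, updated], home_sample)
```

Under these assumed values, replacing the original pattern with the averaged one causes the later attempt to be rejected, while retaining the original pattern alongside the updated one allows it to be accepted.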
With so many systems relying on electronic circuits to control a variety of devices previously operated mechanically, there exists an opportunity to develop a way to offer an off-the-shelf consumer level, low cost solution that could be multi-purposed for additional uses other than physical access. Developing a solution of this type is not without challenges, some of which do not have obvious solutions. The present invention provides novel and non-obvious solutions to these challenges which will be revealed in this application.
The prior art teaches the comparing of voice utterances in order to perform an undefined “security action” by a hardware device supporting the voice technology. However, the prior art does not teach the design of a host electronic hardware device capable of controlling a wide variety of external devices, nor does it provide the necessary user interface and device management facilities. The present invention fulfills this need and provides further non-obvious advantages in design as described in the following summary.