There has heretofore been known a device for recognizing the voice, which collects the voice uttered by a user by using microphones, compares the collected voice with voice patterns stored in advance as recognizable words, and recognizes the stored word having a high degree of agreement as the word uttered by the user. Devices for recognizing the voice of this kind have been incorporated in, for example, car navigation devices.
It has also been known that the voice recognition rate of the device for recognizing the voice depends upon the amount of noise components contained in the voice signals input through the microphones. To solve this problem, the device for recognizing the voice is provided with a device for extracting the voice, which selectively extracts, from the voice signals input through the microphones, only those voice components representing the features of the user's voice.
According to a known method of extracting the voice, sound is collected by a plurality of microphones placed in the same room, and the voice components are separated from the noise components based on the signals input through the plurality of microphones, thereby extracting the voice components. In this method, the voice components are selectively extracted by independent component analysis (ICA), utilizing the fact that the voice components and the noise components contained in the signals input through the microphones are statistically independent of each other (e.g., see Te-Won Lee, Anthony J. Bell, Reinhold Orglmeister, “Blind Source Separation of Real World Signals”, Proceedings of IEEE International Conference on Neural Networks, U.S.A., June 1997, pp. 2129-2135, the contents of which are incorporated herein by reference).
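The principle of separating statistically independent components from a plurality of microphone signals may be illustrated by the following minimal sketch. It is offered for explanation only and is not the algorithm of the cited reference: it substitutes a simple kurtosis-based rotation search for two channels. The synthetic sources (Laplacian samples standing in for the voice, uniform samples standing in for the noise), the mixing weights, and all function names are assumptions introduced for this illustration.

```python
import math
import random


def make_mixtures(n=3000, seed=0):
    """Return two synthetic sources and their two-microphone mixtures."""
    rng = random.Random(seed)
    # Assumed stand-in for voice: Laplacian (super-Gaussian) samples.
    s1 = [rng.expovariate(1.0) * rng.choice((-1.0, 1.0)) for _ in range(n)]
    # Assumed stand-in for noise: uniform (sub-Gaussian) samples.
    s2 = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    # Assumed mixing weights: each microphone picks up both sources.
    x1 = [0.8 * a + 0.3 * b for a, b in zip(s1, s2)]
    x2 = [0.4 * a + 0.9 * b for a, b in zip(s1, s2)]
    return s1, s2, x1, x2


def rotate(u, v, angle):
    """Rotate the two-channel signal (u, v) by the given angle."""
    c, s = math.cos(angle), math.sin(angle)
    return ([c * a + s * b for a, b in zip(u, v)],
            [-s * a + c * b for a, b in zip(u, v)])


def whiten(x1, x2):
    """Center the channels, decorrelate them, and scale to unit variance."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    x1 = [a - m1 for a in x1]
    x2 = [b - m2 for b in x2]
    var1 = sum(a * a for a in x1) / n
    var2 = sum(b * b for b in x2) / n
    cov = sum(a * b for a, b in zip(x1, x2)) / n
    # Rotating onto the principal axes removes the correlation.
    p1, p2 = rotate(x1, x2, 0.5 * math.atan2(2.0 * cov, var1 - var2))
    sd1 = math.sqrt(sum(a * a for a in p1) / n)
    sd2 = math.sqrt(sum(b * b for b in p2) / n)
    return [a / sd1 for a in p1], [b / sd2 for b in p2]


def excess_kurtosis(y):
    """Excess kurtosis of a zero-mean, unit-variance sequence."""
    return sum(a ** 4 for a in y) / len(y) - 3.0


def separate(x1, x2, steps=90):
    """Search for the rotation of the whitened signals that is least Gaussian."""
    z1, z2 = whiten(x1, x2)
    best_score, best = -1.0, None
    for k in range(steps):
        # A quarter turn covers every solution up to order and sign.
        y1, y2 = rotate(z1, z2, 0.5 * math.pi * k / steps)
        score = excess_kurtosis(y1) ** 2 + excess_kurtosis(y2) ** 2
        if score > best_score:
            best_score, best = score, (y1, y2)
    return best


def correlation(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = math.sqrt(sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v))
    return num / den
```

Calling `s1, s2, x1, x2 = make_mixtures()` followed by `y1, y2 = separate(x1, x2)` yields two outputs, each of which can be compared against the sources with `correlation`. As with independent component analysis in general, the order and the sign of the recovered components are not determined, and the sketch relies on two microphones observing exactly two sources.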
However, the above conventional technology involves the following problems. That is, in the conventional method of extracting the voice based on the independent component analysis, the number of microphones provided in the space must be equal to the number of independent components contained in the voice signals (i.e., the number of noise components plus one for the voice component to be extracted). Therefore, even when a plurality of microphones are provided and the voice components are extracted by the conventional method of independent component analysis, there remains a problem in that the voice components cannot be suitably extracted when the number of noise components (i.e., the number of noise sources) varies from time to time.
Further, there remains a problem in that the hardware configuration becomes complex when the signals input through the plurality of microphones are to be processed. In particular, when the input signals from the microphones are to be digitally processed, a storage medium (memory, etc.) of a large capacity must be provided for storing the input signals (digital data), thereby driving up the cost of production.