1. Field of the Invention
The invention relates to a method for comparing at least one acoustic input signal fed into an input device with at least one further acoustic reference signal stored in a memory, in accordance with which
a harmonic analysis of the input signal is carried out in order to produce a Fourier spectrum in a frequency analyzer connected to the input device and the memory, in accordance with which, furthermore,
the Fourier spectrum is used to define an n-dimensional input signal vector with n input (signal) coordinates, and in accordance with which
the input signal vector is checked for correspondence with at least one reference signal vector, which is defined in the same way and thereafter stored in the memory.
2. The Prior Art
In the case of a method of the configuration described at the beginning, a circuit arrangement for voice recognition is described which is equipped with an evaluation circuit for determining spectral feature vectors. In detail, an evaluation is made in this case of a time frame of a digital voice signal by means of a spectral analysis. The feature vectors obtained are compared with reference feature vectors. Before the comparison, the evaluation circuit performs a recursive high-pass filtering of the spectral feature vectors. The aim hereby is to enable recognition independent of speaker and to reduce the influence of interference on the result of recognition (compare DE 41 11 995 A1).
Moreover, there is a description in the prior art of a voice analysis system in which time segments of a voice signal are selected and a series of spectrum components of each voice segment are determined. These spectrum components form a discrete Fourier transform of samples of the voice signal. It is now possible to derive the position of significant peaks in each time segment from the series of relevant spectrum components.
Moreover, the procedure here is to select a value for the pitch. Intervals are defined around this initial value and a number of sequential integral multiples thereof. These intervals are regarded as openings in a mask, specifically in the sense that a frequency value which coincides with an opening is passed by the mask. Thus, in this sense the mask acts as a type of filter for frequency values. The aim in this way is for the described voice analysis system to be insensitive to interference signals and to require fewer calculations (compare DE 29 49 582 A1).
In a method known from practice, the general procedure is for an acoustic input signal, for example a tone, a tone sequence or else voice, to be compared with a stored acoustic reference signal and, in the event of correspondence, for a control device connected to the input device to be activated. This can relate to a door in connection with access control. Driving a machine is also conceivable. Corresponding attempts are also being made in the automotive sector in order, for example, to control individual functions by means of voice, such as setting flashers or switching lights on and off.
The methods and devices previously considered are generally of very complex design and configuration, because in the end a complete analysis of the input signal is performed. The invention starts from this point.
The technical problem on which the invention is based is to develop such a method so as to achieve recognition with a simple means and quickly as well as with high accuracy. Moreover, an appropriately adapted device is to be specified.
In order to achieve this object, the invention proposes in the case of a method of the generic type and in accordance with a first alternative that the respective reference signal vector is flanked by a safety space or definition space, preferably adapted to the number of the reference signal vectors to be stored, and by an identity space. In this case, the input signal vector is usually checked with the reference signal vector for correspondence of respectively corresponding coordinates within prescribed tolerances. Furthermore, the input signal vector is identified as being equal to the reference signal vector at least whenever the input signal vector is situated inside the identity space. Respective reference signal vectors including the definition space have no overlap, that is to say the definition or safety spaces of all the reference signal vectors do not overlap one another. As a rule, recourse is made to at least two reference signal vectors. These reference signal vectors are, as mentioned, configured such that the intersection of respective reference signal vectors including the definition space is an empty set.
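The non-overlap condition on the definition spaces can be illustrated by the following minimal sketch. The geometry of the definition space is not fixed by the description; the sketch assumes, purely for illustration, an axis-aligned box of equal half-width around each reference signal vector. The names `spaces_overlap` and `definition_spaces_disjoint` are hypothetical.

```python
def spaces_overlap(center_a, center_b, half_width):
    """Assumed box model: two boxes of equal half-width overlap only if
    their intervals overlap on every coordinate axis. Boxes that merely
    touch are treated as not overlapping (strict inequality)."""
    return all(abs(a - b) < 2 * half_width for a, b in zip(center_a, center_b))


def definition_spaces_disjoint(references, definition_half_width):
    """Return True if no two reference signal vectors have overlapping
    definition spaces, i.e. every pairwise intersection is empty."""
    return not any(
        spaces_overlap(references[i], references[j], definition_half_width)
        for i in range(len(references))
        for j in range(i + 1, len(references))
    )
```

Under this assumption, a definition space adapted to the number of stored reference signal vectors would simply be a half-width chosen small enough for the check to succeed.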
According to a preferred refinement, furthermore, the input signal vector to be compared with the reference signal vector is not detected outside the respective identity space. Alternatively, it is also possible to proceed so as additionally to define an option space which surrounds the identity space with a gray zone and serves, so to speak, as a collecting net for input signal vectors which cannot be assigned to an identity space of a reference signal vector.
The result in any case is that it is possible to compare the input signal vector and reference signal vector for correspondence in a way which is more accurate and simpler. The reason is that the safety space or definition space can be of variable configuration just like the identity space. Usually, the procedure here is that the reference signal vectors including the safety space exhibit no overlap. Since the configuration of the identity space is smaller than or equal to the safety space, input signal vectors can be interpreted as being equal or identical to the corresponding reference signal vector at least within the identity space, even if large tonal or voice deviations are to be noted in part.
Additionally, it is possible to add the described option space, which defines a type of gray zone around the identity space. This gray zone can be dimensioned such that an input signal vector can be assigned two (or more) reference signal vectors within this region, and only this region. This is generally not desirable, but offers advantages under some circumstances. In any case, owing to this property this gray zone acts, so to say, as a collecting net for input signal vectors which cannot be assigned to an identity space of a reference signal vector. This may be ascribed, inter alia, to the fact that input signal vectors are not detected as a rule outside the respective identity space.
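The interplay of identity space and option space may be sketched as follows. The sketch again assumes box-shaped spaces centered on each reference signal vector, which the description does not prescribe; `within`, `classify` and the two half-width parameters are hypothetical names introduced only for this example.

```python
def within(vector, center, half_width):
    """True if every coordinate lies within half_width of the center."""
    return all(abs(x - c) <= half_width for x, c in zip(vector, center))


def classify(input_vector, references, identity_half_width, option_half_width):
    """Assign an input signal vector to reference signal vectors.

    Returns ("identity", [i]) if the vector lies in the identity space of
    reference i, ("option", [i, j, ...]) if it lies only in the gray zone
    of one or more references, and ("none", []) otherwise.
    """
    for i, ref in enumerate(references):
        if within(input_vector, ref, identity_half_width):
            return ("identity", [i])
    candidates = [
        i for i, ref in enumerate(references)
        if within(input_vector, ref, option_half_width)
    ]
    return ("option", candidates) if candidates else ("none", [])
```

With two references at `[0, 0]` and `[10, 0]`, an identity half-width of 2 and an option half-width of 6, an input at `[5, 0]` falls into the gray zone of both references, which corresponds to the multiple assignment mentioned above.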
According to a further proposal of the invention, which is of independent importance, in the case of a method of the generic type it is proposed that with the aid of an adjustable n-fold filter respectively preselectable characteristics of the Fourier spectrum are evaluated in the frequency analyzer and converted into the n input (signal) coordinates of the input signal vector. In this case, as well, respectively corresponding coordinates of the input signal vector and the reference signal vector are regularly checked here for correspondence within prescribed tolerances.
The prescribable characteristics can be the highest frequency, the maximum amplitude, the duration or relative gains of salient frequencies or the like of the input signal. Of course, a comparable statement also holds for the reference signal vector, which can likewise be defined via the abovementioned characteristics of highest frequency, maximum amplitude, duration and relative gains of salient frequencies.
Of course, it is also possible to use additional coordinates based on other characteristics. It is conceivable here to determine the number of individual peaks in the Fourier spectrum. It would also be possible to determine a coordinate as the sum of the individual amplitudes, and thus as the amplitude integral over the frequency.
In the final analysis, this depends on the number of stored reference signal vectors with which the input signal vector must be compared: the trend is that more coordinates of the individual vectors are required the more reference signal vectors are present, and the closer they are to one another acoustically. In other words, in the case of (a small number of) reference signal vectors of acoustically completely different formation, a relatively coarse grid with a low number of coordinates is generally sufficient. The closer (acoustically) the reference signal vectors, and thus also the input signal vectors, come to one another, the more coordinates then naturally have to be used for differentiation. It is to be taken into account in this case that both input signal vector and reference signal vector must in each case be of n-dimensional design so that the comparison can be carried out sensibly in a conventionally used (n-fold) comparator (where n is the number of the selected characteristics of the Fourier spectrum). According to a preferred embodiment, the invention further provides that the Fourier spectrum is recorded with a time constant adapted to the maximum length of the input signal, so that the Fourier spectrum can be exactly mapped onto the signal vector.
The abovementioned characteristics of the Fourier spectrum are determined and converted in the frequency analyzer with the aid of the adjustable n-fold filter. It is conceivable, for example, to evaluate the highest frequency, or else relative gains of salient frequencies in such a way as to use appropriately designed frequency filters. A similar statement holds for amplitude filters within which the maximum amplitude can be determined. In the simplest case, the duration of the signal can be measured via a timer or time filter. Recourse may be made to a summer as a filter, so to speak, for the amplitude integration over the frequency. The number of individual peaks can be detected with the aid of an amplitude filter in conjunction with a downstream counter. It is clear from the above that all the filters mentioned can be designed to be easily adjustable, with the result that the described possible selections of the characteristics of the Fourier spectrum can be represented in this way.
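By way of illustration only, the evaluation of the named characteristics can be emulated in software as in the following sketch. It replaces the adjustable n-fold filter by simple computations on a discrete Fourier spectrum, which is an assumption and not the filter arrangement of the description; the function names, the `peak_threshold` parameter and the order of the coordinates are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Naive DFT magnitudes for bins 0..N/2, scaled by 1/N (O(N^2))."""
    n = len(samples)
    return [
        abs(sum(s * cmath.exp(-2j * math.pi * k * t / n)
                for t, s in enumerate(samples))) / n
        for k in range(n // 2 + 1)
    ]


def signal_vector(samples, sample_rate, peak_threshold):
    """Map a sampled signal onto a 5-dimensional signal vector:
    [highest frequency, maximum amplitude, duration,
     amplitude sum over frequency, number of peaks]."""
    spectrum = magnitude_spectrum(samples)
    bin_width = sample_rate / len(samples)
    # "Frequency filter": highest bin whose amplitude reaches the threshold.
    above = [k for k, a in enumerate(spectrum) if a >= peak_threshold]
    highest_frequency = max(above) * bin_width if above else 0.0
    # "Amplitude filter": maximum amplitude in the spectrum.
    max_amplitude = max(spectrum)
    # "Timer": duration of the recorded signal in seconds.
    duration = len(samples) / sample_rate
    # "Summer": amplitude integral over the frequency, as a plain sum.
    amplitude_sum = sum(spectrum)
    # "Amplitude filter with downstream counter": local maxima above threshold.
    peaks = sum(
        1 for k in range(1, len(spectrum) - 1)
        if spectrum[k] >= peak_threshold
        and spectrum[k] > spectrum[k - 1]
        and spectrum[k] >= spectrum[k + 1]
    )
    return [highest_frequency, max_amplitude, duration, amplitude_sum, float(peaks)]
```

For a one-second test signal sampled at 400 Hz and composed of a 50 Hz tone and a weaker 120 Hz tone, this sketch yields a highest frequency of 120 Hz and two peaks, provided the threshold lies below the amplitude of the weaker tone.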
After evaluation of these properties with the aid of the adjustable filters in the frequency analyzer, it is possible to sum (and average) the found (measured) values over the time constant in one (or more) integrator(s). It is possible at their output to tap and further process corresponding values for the input coordinates of the input signal vector or reference coordinates of the reference signal vector.
In simple terms, this means that mathematically the Fourier spectrum of the input signal or of the reference signal is mapped onto an n-dimensional input signal vector or reference signal vector. In order to define the reference signal vector, it is possible in this case for a specific reference signal (as input signal, so to say) to be repeatedly subjected to harmonic analysis, a signal vector being determined each time. The individual signal vectors detected and evaluated in this case can be averaged in order to determine the reference signal vector. This is performed in the simplest case in such a way that the corresponding reference coordinates of the respective signal vectors are added for the purpose of arithmetic averaging and divided by the number of coordinates added, that is to say by the number of signal vectors.
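The arithmetic averaging of several signal vectors into one reference signal vector can be sketched as follows. The sketch assumes that each coordinate is averaged over the repeated analyses, so that the divisor is the number of signal vectors; the function name `average_reference_vector` is hypothetical.

```python
def average_reference_vector(signal_vectors):
    """Average several n-dimensional signal vectors coordinate by
    coordinate to obtain one n-dimensional reference signal vector."""
    if not signal_vectors:
        raise ValueError("at least one signal vector is required")
    n = len(signal_vectors[0])
    if any(len(v) != n for v in signal_vectors):
        raise ValueError("all signal vectors must have the same dimension n")
    count = len(signal_vectors)
    return [sum(v[i] for v in signal_vectors) / count for i in range(n)]
```

For example, averaging `[1.0, 2.0]` and `[3.0, 6.0]` gives the reference signal vector `[2.0, 4.0]`.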
A preselectable number m of reference signal vectors with n reference coordinates can be stored in the memory by forming an m×n reference signal matrix there. Each point of this m×n reference signal matrix therefore corresponds to a specific reference coordinate which, for its part, has been derived from a characteristic of the Fourier spectrum.
The tolerances in the checking for correspondence (between the input signal vector and reference signal vector) are preferably formed as prescribed interval deviations of a respective reference coordinate of the reference signal vector by determining a respectively corresponding reference signal coordinates value range. In other words, the n-dimensional reference signal vector consists of individual reference coordinates which, for their part, span a respective value range, specifically the reference signal coordinates value range. In order to check the correspondence of the input signal vector and reference signal vector, the relative position of each input signal coordinate is determined by comparison with the associated reference signal coordinates value range. Correspondence obtains if the input signal coordinate is within this defined reference signal coordinates value range.
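The coordinate-wise check against value ranges can be sketched as follows. The sketch assumes symmetric interval deviations around each reference coordinate, which is one possible choice and not prescribed by the description; `value_ranges` and `coordinates_in_range` are hypothetical names.

```python
def value_ranges(reference, deviations):
    """Build one (lower, upper) value range per reference coordinate
    from a prescribed interval deviation for that coordinate."""
    return [(r - d, r + d) for r, d in zip(reference, deviations)]


def coordinates_in_range(input_vector, ranges):
    """For each input signal coordinate, report whether it lies within
    the associated reference signal coordinates value range."""
    return [lo <= x <= hi for x, (lo, hi) in zip(input_vector, ranges)]
```

With reference coordinates `[100.0, 8.0]` and deviations `[10.0, 2.0]`, the value ranges are `(90.0, 110.0)` and `(6.0, 10.0)`, so the input `[95.0, 11.0]` corresponds in the first coordinate only.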
The invention also defines an identity space. This is the range, namely the reference signal (vector) coordinates value range in the case of a reference signal vector and the reference signal matrix coordinate value range in the case of a reference signal matrix, which permits a unique assignment of each individual input signal coordinate (of an input signal vector, or else of an input signal matrix) to the associated reference signal coordinate. In addition, it is also possible to determine a safety space which, so to speak, detects input signal coordinates which cannot be uniquely assigned. This is explained in more detail with reference to the description of the figures.
The complete coincidence of the input signal vector with the reference signal vector (within the scope of the tolerances) is now determined when a prescribed number z, where z ≤ n, of input signal coordinates are situated within the respectively associated reference signal coordinates value range. Consequently, a further variation can be undertaken in the grid in order to determine correspondences by selecting the number z, in addition to determining the above described interval deviations and the number of the characteristics taken into account. The larger the number z by comparison with the dimension of the vectors (n) to be compared, the nearer (acoustically) the vectors to be checked have to come to one another.
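The z-of-n criterion can be sketched as follows, assuming the value ranges are given as (lower, upper) pairs per coordinate and that at least z coordinates in range suffice; the function name `vectors_coincide` is hypothetical.

```python
def vectors_coincide(input_vector, ranges, z):
    """Return True if at least z of the n input signal coordinates lie
    within their associated reference coordinate value ranges."""
    n = len(ranges)
    if len(input_vector) != n:
        raise ValueError("input vector and ranges must both have dimension n")
    if not 0 < z <= n:
        raise ValueError("z must satisfy 0 < z <= n")
    hits = sum(1 for x, (lo, hi) in zip(input_vector, ranges) if lo <= x <= hi)
    return hits >= z
```

With the three ranges `(90, 110)`, `(6, 10)` and `(0, 1)`, the input `[95, 11, 0.5]` has two coordinates in range. It therefore coincides for z = 2 but not for z = 3, which shows how a larger z tightens the comparison.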
A control device connected to the input device is activated in the event of correspondence of the input signal vector with the reference signal vector. This can be a central operating device for a door, a gate, an elevator etc. for the purpose of access control. Driving a machine as a whole is also conceivable. Of course, individual functions of an overall system can also be controlled in this way.
A device operating in accordance with the method according to the invention is the subject matter of patent claim 10. It is of particular importance within the scope of the invention that the comparison of the acoustic input signal fed in with one (or more) acoustic reference signals stored in the memory is performed in a way which is particularly simple, fast and efficient. In essence, this is achieved by dispensing completely with the detailed evaluation and the comparison of temporal amplitude characteristics, Fourier spectra or the like. Rather, the signal comparison is undertaken such that the signal sequences to be checked are transformed into the Fourier space, and the Fourier spectrum produced here is mapped onto an n-dimensional vector. In other words, the respective signal sequences are identified with n-dimensional vectors which permit a rapid and simple comparison with one another. The assignment of the Fourier spectrum to individual coordinates of the above-named n-dimensional vectors is performed in this case by means of the described filters in the frequency analyzer, while the comparison is carried out in an (n-dimensional) comparator. In any case, the Fourier spectrum can ideally be reduced to a sequence of (binary) data (or also analog values), which permit a simple, reliable and rapid comparison with one another. Moreover, the device can have a learning configuration by virtue of the fact that the reference signal or signals stored in the memory is/are updated in specific cycles. This is where the essential advantages of the invention are to be seen. All the above-described devices can, of course, be combined to form an overall system, for example a computer system.