U.S. Pat. No. 5,918,223 discloses a means for finding audio data files or segments of digital audio that sound similar to given sounds, or that sound similar to predefined classes of sounds. The system first measures a variety of acoustical features of each sound file. It measures the loudness, bass, pitch, brightness, bandwidth and Mel-frequency cepstral coefficients at periodic intervals over the length of the sound file. Then it computes specific statistical measurements, namely the mean and standard deviation of each of these features, to describe their variation over time. This set of statistical measurements is represented as an N-vector, also known as a feature vector. The user can create classes of sounds by specifying a set of sound files that belong to this class. In this case, the user selects samples of sounds that demonstrate the properties of sounds that demonstrate the property the user wishes to train. Each of the sample sounds are then used to compute an average vector for the set, μ, and a normalisation vector for the set, V (The normalisation values are either the standard deviation or range values). These vectors can be stored in separate database which defines categories. Once categories have been defined by providing sets of vectors which have a large degree of the property being defined, then we can compare individual sounds with the categories and come up with a distance measure between a sound and a category. This distance of the example vector A, to a category as defined by μ and V, is given by:distance=√{square root over (Σ(A[i]−μ[i]/V[i])2)}; i=0 to N−1.
The distance can be compared to some threshold value to determine whether the sound is “in” or “out” of the class. If it is known a priori that some acoustic features are unimportant for the class, these can be ignored or given a lower weight in the computation of distance.
A problem of the known method is that the calculated distance is based on the assumption that N-vectors defining a set are distributed evenly around the mean, and that each set is defined by the same number of N-vectors.