1. Field of the Invention
The present invention relates to audio fingerprints, and more particularly, to a device, method, and medium for generating audio fingerprints by extracting noise-robust modulation spectrums from audio data and for retrieving audio data by the use of the generated audio fingerprints.
2. Description of the Related Art
A user uses an audio fingerprint identification technology to acquire, in real time, information on music output from an output unit such as a radio, a television, or an audio set.
In the audio fingerprint retrieval method by Philips®, an audio signal with a sampling rate of 5 kHz is divided into frames with a time length of 0.37 s, successive frames being shifted by 11.6 ms, and a power spectrum is generated for each frame by the use of a Fourier transform. Here, the Fourier transform band between 300 Hz and 2 kHz is divided into 33 non-overlapping, logarithmically spaced frequency bands. The power spectrum in each logarithmic sub-band is then summed to calculate the band energy. Differences in energy are calculated along the frame axis and the frequency axis, the calculated differences are converted into bits, and the bits are indexed by the use of a hashing method. However, since the elements extracted by the Philips® method are strongly affected by noise, retrieval performance deteriorates for audio data recorded in noisy conditions, and the method is difficult to apply to a variety of environments.
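The band-energy and bit-derivation steps described above may be illustrated by the following minimal sketch. It is not the Philips® implementation. The function names (log_band_edges, band_energies, fingerprint_bits) are hypothetical, and the rule that a bit is 1 when the combined frequency-axis and frame-axis energy difference is positive is an assumption, since the description states only that the differences are converted into bits. Framing, the Fourier transform, and hashing are omitted.

```python
import math

def log_band_edges(f_lo=300.0, f_hi=2000.0, n_bands=33):
    """Return n_bands + 1 logarithmically spaced band edges in Hz."""
    ratio = f_hi / f_lo
    return [f_lo * ratio ** (i / n_bands) for i in range(n_bands + 1)]

def band_energies(power_spectrum, bin_hz, edges):
    """Sum the power-spectrum bins that fall inside each band.

    power_spectrum[k] is the power at frequency k * bin_hz.
    A band covers the half-open interval [edges[b], edges[b + 1]).
    """
    energies = [0.0] * (len(edges) - 1)
    for k, p in enumerate(power_spectrum):
        f = k * bin_hz
        for b in range(len(energies)):
            if edges[b] <= f < edges[b + 1]:
                energies[b] += p
                break
    return energies

def fingerprint_bits(energy_frames):
    """Convert per-frame band energies into bits.

    For consecutive frames, the difference between adjacent bands in the
    current frame is compared with the same difference in the previous
    frame. Assumption: the bit is 1 when the result is positive, else 0.
    """
    bits = []
    for prev, cur in zip(energy_frames, energy_frames[1:]):
        row = []
        for m in range(len(cur) - 1):
            d = (cur[m] - cur[m + 1]) - (prev[m] - prev[m + 1])
            row.append(1 if d > 0 else 0)
        bits.append(row)
    return bits
```

Under these assumptions, 33 bands yield 32 bits for each pair of consecutive frames, because each bit compares two adjacent bands.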
In the audio fingerprint retrieval method by Fraunhofer®, power spectrums are first generated from an audio signal in a manner similar to the method by Philips®. Here, the Fourier transform band between 250 Hz and 4 kHz is divided into ¼-octave frequency bands. A spectral flatness measure and a spectral crest measure are extracted from each of these frequency bands to retrieve audio fingerprints. However, since the Fraunhofer® method is not robust to noise and employs a statistical method and a vector quantization method, it has low accuracy and low retrieval speed.
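The two per-band features named above may be illustrated by the following minimal sketch. It is not the Fraunhofer® implementation. The description does not define the measures, so the commonly used definitions are assumed here: spectral flatness as the ratio of the geometric mean to the arithmetic mean of the power values in a band, and spectral crest as the ratio of the maximum to the arithmetic mean. The function names are hypothetical.

```python
import math

def quarter_octave_edges(f_lo=250.0, f_hi=4000.0):
    """Return band edges in Hz spaced a quarter octave apart."""
    n_bands = round(4 * math.log2(f_hi / f_lo))
    return [f_lo * 2.0 ** (i / 4) for i in range(n_bands + 1)]

def spectral_flatness(power):
    """Geometric mean divided by arithmetic mean of the power values.

    Values near 1 indicate a flat, noise-like band; values near 0
    indicate a peaky, tone-like band. Returns 0.0 if any value is not
    positive, since the geometric mean is then zero or undefined.
    """
    n = len(power)
    arith = sum(power) / n
    if arith <= 0 or min(power) <= 0:
        return 0.0
    log_mean = sum(math.log(p) for p in power) / n
    return math.exp(log_mean) / arith

def spectral_crest(power):
    """Maximum power value divided by the arithmetic mean."""
    arith = sum(power) / len(power)
    return max(power) / arith if arith > 0 else 0.0
```

With these edges, the range from 250 Hz to 4 kHz spans four octaves and therefore 16 quarter-octave bands, each contributing one flatness value and one crest value per frame.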