The present invention relates to a method for recognizing a melody from a set of stored melodies, in which an audio sample representing a melody to be recognized is produced to form a first search criterion. The invention also relates to a system for recognizing a melody comprising means for storing melodies, means for recognizing a melody from a set of stored melodies, means for producing an audio sample representing a melody to be recognized, and means for forming a first search criterion on the basis of said audio sample. Further, the invention relates to a database server comprising means for storing melodies, means for recognizing a melody from a set of stored melodies, means for receiving an audio sample representing a melody to be recognized, and means for forming a first search criterion on the basis of said audio sample.
Tremendous growth in multimedia information has increased the need to develop methods for searching this multitude for specific multimedia information, such as pieces of music or other melodies. Text-based search methods are known, whereby keywords entered by a user, e.g., on the keyboard of a computer, can be used to search for a desired piece of music or another melody. These pieces of music are stored in a database to which a data transmission connection can be set up, or the data can be stored locally, for example, in jukeboxes or, e.g., on compact discs in the user's own music archive. A drawback of this system is, inter alia, that the database can be very large, so that a few keywords cannot narrow the search sufficiently, and it may be time consuming to find exactly the relevant information. Furthermore, the user of such a search system may not remember all the essential information about the piece of music to be searched for, in which case the search may return a large number of possible pieces of music, among which the user must then find the one that corresponds to the piece of music that was searched for.
Recognition methods and devices based on sound recognition have also been developed that can be used to search audio samples stored in a database, such as pieces of music, for a specific sample by humming a part of the piece of music to be searched for. Thus, in the system, the humming is compared with the pieces of music stored in the database. Such music-based databases have been compiled, e.g., in servers coupled to the Internet data network, so that these databases can be globally accessed. The quantity of information contained in such databases is very large, so that searching by humming can take a considerable time. Furthermore, the length of the search keys required for the searches increases with the quantity of information contained in the databases, which may impair the accuracy of the recognition. The recognition is made still more difficult by the fact that people hum the same audio sample in different ways, which makes it difficult for the recognition system to find exactly the relevant piece. Thus, the retrieval system will give as the retrieval result numerous pieces of music which possibly correspond to the audio sample that was hummed. After this, the user must still find out by listening which of the pieces of music given by the retrieval system corresponds to the piece searched for by the user. In some situations, it may even happen that the retrieval system does not find the piece that the user tried to hum. In such a system based on acoustic recognition, the audio sample used can be, instead of or besides humming, an audio sample that is whistled and/or recorded or played with an instrument. Such systems also use solutions in which the aim is not to find an exact match; instead, slight differences are allowed in the search, e.g., due to the above-mentioned sources of error.
U.S. Pat. No. 5,874,686 presents an apparatus and a method for searching for a melody. The user hums a piece of music to be searched for, and this humming is input into a computer for processing. The humming is converted to a sequence of digitized representations of relative pitch differences between successive notes. After this, the database is searched for a piece of music or melody which at least roughly resembles the digitized sample sequence formed from the humming.
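The following minimal sketch illustrates the general idea of representing a melody by the relative pitch differences between successive notes and searching for the stored melody that roughly resembles it. It is an illustration only and is not taken from the cited patent: the use of MIDI note numbers, the Levenshtein edit distance as the similarity measure, and the function names `midi_to_intervals`, `edit_distance` and `best_match` are all assumptions made for this example.

```python
def midi_to_intervals(notes):
    """Return the pitch differences (in semitones) between successive notes.

    Assumption: notes are given as MIDI note numbers. Using differences
    makes the representation independent of the key the user hums in.
    """
    return [b - a for a, b in zip(notes, notes[1:])]


def edit_distance(a, b):
    """Levenshtein distance between two interval sequences.

    Assumption: an approximate match is scored by counting insertions,
    deletions and substitutions; other measures could be used instead.
    """
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def best_match(hummed_notes, database):
    """Return the title whose interval sequence is closest to the hummed one.

    `database` is a hypothetical mapping from title to a list of MIDI notes.
    """
    query = midi_to_intervals(hummed_notes)
    return min(
        database,
        key=lambda title: edit_distance(query, midi_to_intervals(database[title])),
    )
```

Because only pitch differences are compared, a sample hummed in a different key, or with a small number of wrong notes, can still be matched to the intended melody, at the cost of comparing the query against every stored melody.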
In such search systems, it is important for the user that the system gives some kind of response to the search command as quickly as possible. Furthermore, users prefer communication between people to conventional communication between man and machine. Thus, the implementation of the user interface of the search system should be carefully considered to avoid inconvenient and slow use of the search system. Furthermore, the use of text-based systems, e.g., with the keyboard of a portable device, may in some situations be difficult.
To accelerate the search, the information contained in the database can be divided into smaller sub-areas, e.g., when the database is compiled, so that the search is applied to one sub-area at a time. However, such an arrangement has, e.g., the drawback that the user must be aware of this division of the information to be able to first select the correct sub-area for the search.
It is an aim of the present invention to provide a method of searching for melodies, whereby the search can be accelerated compared to methods of prior art. It is also an aim of the invention to provide a system of searching for pieces of music. The invention is based on the idea that for searching for pieces of music, the user says one or several key words related to the piece to be searched for and also utters an audio sample of the piece of music to be searched for, e.g., by humming, whistling or in another way. More precisely, the method according to the invention is primarily characterized in that in the method, an audio sample is produced from at least one word related to the melody to be recognized to form a second search criterion, wherein in the recognition, a first search set is formed of the stored melodies on the basis of one said search criterion, and one other said search criterion is used for recognizing the melody from said first search set.
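The two-criterion search described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the claimed implementation: it assumes that the spoken words have already been converted to text by a speech recognizer, that the word-based criterion is applied first (the text allows either criterion to form the first search set), and that melodies are compared by a crude position-wise mismatch count. The names `Melody`, `first_search_set`, `interval_mismatch` and `recognize` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Melody:
    title: str
    keywords: frozenset   # lower-case words related to the melody (assumed metadata)
    intervals: tuple      # pitch differences between successive notes


def first_search_set(melodies, spoken_words):
    """Form the first search set: stored melodies sharing at least one spoken word."""
    words = {w.lower() for w in spoken_words}
    return [m for m in melodies if words & m.keywords]


def interval_mismatch(a, b):
    """Crude dissimilarity: position-wise differences plus the length difference."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


def recognize(melodies, spoken_words, hummed_intervals):
    """Recognize the melody from the first search set using the hummed sample."""
    candidates = first_search_set(melodies, spoken_words)
    if not candidates:
        return None  # no stored melody matches the spoken words
    return min(candidates,
               key=lambda m: interval_mismatch(hummed_intervals, m.intervals))
```

In this sketch the comparatively expensive melody comparison runs only over the candidates that survive the word-based filter, which is how restricting the search set can accelerate the recognition.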
The system according to the present invention is primarily characterized in that the system further comprises means for producing an audio sample from at least one word related to the melody to be recognized to form a second search criterion, means for forming a first search set of the stored melodies on the basis of one said search criterion, and means for recognizing the melody from said first search set on the basis of one other said search criterion. Further, the database server according to the present invention is primarily characterized in that the database server further comprises means for producing an audio sample from at least one word related to the melody to be recognized to form a second search criterion, means for forming a first search set of the stored melodies on the basis of one said search criterion, and means for recognizing the melody from said first search set on the basis of one other said search criterion.
Considerable advantages are achieved by the present invention compared to search solutions of prior art. When applying the method of the invention, the actual melody search can be restricted to a narrower field, so that the search can be performed faster and more accurately than when applying solutions of prior art. Furthermore, in the search system according to the invention, the audio signal uttered by the user is subjected to automatic separation of speech-containing parts from music-containing parts, so that the use of such a search system more closely resembles communication between people than communication between man and machine. This makes the use of the system more convenient and faster compared to systems in which the user must, e.g., type search conditions on a keyboard.