The invention relates to a method for generating a note-based code representing musical information. Further, the invention relates to a method for generating accompaniment to a musical presentation.
Generally, there are various prior art methods for producing control signals used for the control of electronic musical instruments or synthesizers. For example, MIDI is widely used for controlling electronic musical instruments. The abbreviation MIDI stands for Musical Instrument Digital Interface, and it is a de facto industry standard for sound synthesizers. MIDI is an interface through which synthesizers, rhythm machines, computers, etc., can be linked together. Information on MIDI standards can be found, e.g., in [1].
A non-heuristic automatic composition method is disclosed in [2]. This composition method utilizes the principle of a self-learning grammar system called dynamically expanding context (DEC) to produce a continuous sequence of codes by learning its rules from a given set of examples. As in Markov processes, a code in a sequence of codes is defined in the composition method on the basis of the codes immediately preceding it. The composition method, however, uses discrete “grammatical” rules in which the length of the contents of the search arguments of the rules, i.e. the number of required preceding codes, is a dynamic parameter which is defined on the basis of discrepancies (conflicts) occurring in the training sequences (strings) when the rules are being formed from the training sequences. In other words, if two or more rules have the same search argument but different consequences, i.e. a different new code, during the production of the rules, these rules are indicated to be invalid, and the length of their search argument is increased until unambiguous or valid rules are found. The method of dynamically expanding the context is to a very great extent based on the utilization of this structure. As the mentioned rules are produced mechanically on the basis of local equivalences between symbols occurring in the training material, the production of rules does not, for instance, require music-theoretical analysis based on expertise on the training music material.
Correspondingly, when the rules are utilized to generate a new code after a sequence of codes, the code generated last in the code sequence is first compared with the rules in a search table stored in the memory, then the two last codes are compared, etc., until equivalence is found with the search argument of a valid rule, whereby the code indicated by the consequence of this rule can be added last in the sequence of codes. The above-mentioned tree structure enables systematic comparisons. This results in an “optimal” sequence of codes which “stylistically” attempts to follow the rules produced on the basis of the training sequences.
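The rule formation and rule lookup described above can be illustrated with a minimal sketch. It is not the implementation of [2]: the function names `build_dec_rules` and `next_code`, the `max_context` limit, and the use of a flat dictionary in place of a tree-structured search table are all assumptions made for illustration.

```python
def build_dec_rules(training, max_context=8):
    """Form rules {context tuple: next code} from one training sequence.

    Every position starts with a context of one preceding code. Whenever
    two positions share a context but disagree on the next code, that
    context is marked invalid and the contexts are lengthened by one code.
    """
    depth = {i: 1 for i in range(1, len(training))}  # context length per position
    invalid = set()
    while True:
        table = {}
        for i, n in depth.items():
            table.setdefault(tuple(training[i - n:i]), set()).add(training[i])
        conflicts = {ctx for ctx, outs in table.items() if len(outs) > 1}
        invalid |= conflicts
        grew = False
        for i, n in list(depth.items()):
            # Expand only where more preceding codes are actually available.
            if tuple(training[i - n:i]) in conflicts and n < min(i, max_context):
                depth[i] = n + 1
                grew = True
        if not grew:
            break
    rules = {ctx: next(iter(outs)) for ctx, outs in table.items()
             if len(outs) == 1 and ctx not in invalid}
    return rules, invalid


def next_code(sequence, rules, max_context=8):
    """Try the last code, then the last two codes, etc., until a valid rule matches."""
    for n in range(1, min(len(sequence), max_context) + 1):
        ctx = tuple(sequence[-n:])
        if ctx in rules:
            return rules[ctx]
    return None  # no valid rule matches this sequence
```

With the training sequence C, D, E, C, D, G, the context (D) is ambiguous because it is followed by both E and G, so the sketch lengthens it until the rule (E, C, D) → G becomes unambiguous.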
According to the prior art, the key sequence (a note-based code) for an automatic accompanist can be produced for example by a MIDI keyboard that is connected to a MIDI port in a computer, or it can be loaded from a MIDI file stored in a memory. The MIDI keyboard produces note events comprising note-on/note-off event pairs and the pitch of the note as the user plays the keyboard. For the accompanist the note events are converted into a sequence of single length units, e.g. quavers (⅛ notes), of the same pitch. The key sequence can also be given by other means; for example by using a graphical user interface (GUI) and an electronic pointing device, such as a mouse, or by using a computer keyboard.
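The conversion of note events into single length units might look like the following sketch. The function name `to_unit_sequence`, the representation of a note as a (MIDI pitch, duration in beats) pair, and the convention that one beat is a crotchet, so that a quaver is 0.5 beats, are assumptions for illustration only.

```python
def to_unit_sequence(notes, unit=0.5):
    """Expand (midi_pitch, duration_in_beats) notes into unit-length codes.

    Each note becomes a run of equal-pitch units; with unit=0.5 beats the
    units are quavers. Notes shorter than one unit still yield one code.
    """
    sequence = []
    for pitch, duration in notes:
        count = max(1, round(duration / unit))
        sequence.extend([pitch] * count)
    return sequence
```

For example, a crotchet on pitch 60 followed by a quaver on pitch 62 would become the three-code sequence 60, 60, 62.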
An object of the present invention is to provide a method for generating a note-based code representing musical information and further a method for generating accompaniment to a musical presentation. This and other objects are achieved with methods and computer software which are characterized by what is disclosed in the attached independent claims. Preferred embodiments of the invention are disclosed in the attached dependent claims.
The method according to the invention is based on receiving musical information in the form of an audio signal and applying an audio-to-notes conversion to the audio signal for generating the note-based code representing the musical information.
The audio signal is produced for example by singing, humming, whistling or playing an instrument. Alternatively, the audio signal may be output from a computer storage medium, such as a CD or a floppy disk.
In a further method according to the invention, the note-based code generated on the basis of an audio signal by the audio-to-notes conversion is used for controlling an automatic composition method in order to provide accompaniment to a musical presentation. The automatic composition method has been described in the background part of this application. The automatic composition method generates a code sequence corresponding to new melody lines on the basis of the note-based code. This code sequence may be used for controlling a synthesizer or a similar electronic musical device for providing audible accompaniment. Preferably, the accompaniment is provided in real time. The code sequence corresponding to new melody lines may also be stored in a MIDI file or in a sound file. Herein, the term ‘melody line’ refers generally to a musical content formed by a combination of notes and pauses. In contrast to the new melody lines, the note-based code may be considered as an old melody line.
The audio-to-notes conversion method according to the invention comprises estimating fundamental frequencies of the audio signal for obtaining a sequence of fundamental frequencies and detecting note events on the basis of the sequence of fundamental frequencies for obtaining the note-based code.
In an audio-to-notes conversion method according to an embodiment of the invention, the audio signal containing musical information is segmented into frames in time, and the fundamental frequency of each frame is detected for obtaining a sequence of fundamental frequencies. In the next phase, the fundamental frequencies are quantized, i.e. converted for example into a MIDI pitch scale, which effectively quantizes the fundamental frequency values into a semitone scale. The segments of consecutive equal MIDI pitch values are then detected and each of these segments is assigned as a note event (note-on/note-off event pair) for obtaining the note-based code representing the musical information.
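The quantization and segmentation steps of this embodiment can be sketched as follows, assuming the per-frame fundamental frequencies have already been estimated by some pitch detector. The function names, the use of 0 to mark a frame without pitch, and the 440 Hz reference for MIDI note 69 (standard concert pitch) are assumptions of this sketch, not details given in the text.

```python
import math
from itertools import groupby


def hz_to_midi(f0_hz):
    """Quantize a frequency in Hz to the nearest MIDI note number (semitone scale).

    Frames without a pitch (f0 <= 0) map to 0, used here as "no note".
    """
    if f0_hz <= 0:
        return 0
    return int(round(69 + 12 * math.log2(f0_hz / 440.0)))


def segment_notes(f0_track):
    """Turn a per-frame f0 track into (pitch, start_frame, end_frame) note events.

    Each run of consecutive equal MIDI pitch values becomes one note;
    start_frame is where note-on occurs, end_frame where note-off occurs.
    """
    pitches = [hz_to_midi(f) for f in f0_track]
    events = []
    frame = 0
    for pitch, run in groupby(pitches):
        length = len(list(run))
        if pitch != 0:
            events.append((pitch, frame, frame + length))
        frame += length
    return events
```

Small frame-to-frame deviations, such as 438 Hz and 442 Hz around 440 Hz, fall on the same semitone and therefore merge into a single note event.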
In an audio-to-notes conversion method according to another embodiment of the invention, the audio signal containing musical information is processed in frames. The fundamental frequency of each frame is detected and the fundamental frequencies are quantized. As distinct from the previous embodiment, the frames are processed one by one at the same time as the audio signal is being provided. The quantized fundamental frequencies are coded into note events in real time by comparing the present fundamental frequency to the previous fundamental frequency. Any transition from zero to a non-zero value is assigned to a note-on event and a pitch corresponding to the current fundamental frequency. Accordingly, a transition from a non-zero to a zero value results in a note-off event and a change from a non-zero to another non-zero value results in a note-off event and a note-on event after the note-off event and a pitch corresponding to the current fundamental frequency. Hence, the note-based code representing musical information is constructed at the same time as the input signal is provided.
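The transition rules of this embodiment can be expressed as a small generator that consumes quantized pitch values one frame at a time. The name `stream_note_events` and the event tuple layout (event type, pitch, frame index) are assumptions chosen for illustration; the value 0 stands for a frame without a fundamental frequency, as in the text.

```python
def stream_note_events(quantized_pitches):
    """Yield note events frame by frame from quantized pitch values.

    zero -> non-zero     : note-on
    non-zero -> zero     : note-off
    non-zero -> other    : note-off, then note-on at the new pitch
    """
    previous = 0
    for frame, current in enumerate(quantized_pitches):
        if current != previous:
            if previous != 0:
                yield ("note_off", previous, frame)
            if current != 0:
                yield ("note_on", current, frame)
        previous = current
```

Because the generator only keeps the previous pitch value, it can be driven directly by a live input stream instead of a stored list.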
In an audio-to-notes conversion method according to still another embodiment of the invention, the audio signal containing musical information is processed in frames, and the note-based code representing musical information is constructed at the same time as the input signal is provided. The signal level of a frame is first measured and compared to a predetermined signal level threshold. If the signal level threshold is exceeded, a voicing decision is executed for judging whether the frame is voiced or unvoiced. If the frame is judged voiced, the fundamental frequency of the frame is estimated and quantized for obtaining a quantized present fundamental frequency. Then, it is decided on the basis of the quantized present fundamental frequency whether a note is found. If a note is found, the quantized present fundamental frequency is compared to the fundamental frequency of the previous frame. If the previous and present fundamental frequencies are different, a note-off event and a note-on event after the note-off event are applied. If the previous and present fundamental frequencies are the same, no action is taken. If the signal level threshold is not exceeded, or if the frame is judged unvoiced, or if no note is found, it is checked whether a note-on event is currently valid, and if so, a note-off event is applied. The procedure is repeated frame by frame at the same time as the audio signal is received for obtaining the note-based code.
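The per-frame decision flow of this embodiment can be sketched as below. The text does not specify how the signal level is measured, how voicing is decided, how the fundamental frequency is estimated, or what criterion decides that a note is found, so this sketch assumes an RMS level, caller-supplied `is_voiced` and `estimate_f0` functions, and a MIDI pitch range check; the class name and all default values are hypothetical. The sketch also emits a note-off only when a note is actually sounding.

```python
import math


class FrameNoteCoder:
    """Codes audio frames into note events one frame at a time."""

    def __init__(self, estimate_f0, is_voiced,
                 level_threshold=0.01, pitch_range=(36, 96)):
        self.estimate_f0 = estimate_f0      # frame -> frequency in Hz (assumed helper)
        self.is_voiced = is_voiced          # frame -> bool (assumed helper)
        self.level_threshold = level_threshold
        self.pitch_range = pitch_range      # assumed "note found" criterion
        self.active_pitch = None            # pitch of the currently valid note-on

    def process(self, frame):
        """Return the list of note events caused by one frame of samples."""
        pitch = None
        level = math.sqrt(sum(x * x for x in frame) / len(frame))  # RMS level
        if level > self.level_threshold and self.is_voiced(frame):
            f0 = self.estimate_f0(frame)
            if f0 > 0:
                midi = int(round(69 + 12 * math.log2(f0 / 440.0)))
                low, high = self.pitch_range
                if low <= midi <= high:
                    pitch = midi            # a note is found
        events = []
        if pitch is None:
            # Too quiet, unvoiced, or no note: end any note still sounding.
            if self.active_pitch is not None:
                events.append(("note_off", self.active_pitch))
                self.active_pitch = None
        elif pitch != self.active_pitch:
            if self.active_pitch is not None:
                events.append(("note_off", self.active_pitch))
            events.append(("note_on", pitch))
            self.active_pitch = pitch
        return events
```

Calling `process` once per incoming frame yields the note-based code incrementally, so the same object could feed an accompaniment generator while the audio is still being received.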
An advantage of the method according to the invention is that it can be used by people without any knowledge of musical theory for producing a note-based code representing musical information by providing the musical information in the form of an audio signal for example by singing, humming, whistling or playing an instrument. A further advantage is that the invention provides means for generating real time accompaniment to a musical presentation.