The present invention relates to a tone signal processing apparatus and method for generating not only a lead note or tone on the basis of an input tone or voice but also an additional tone harmonious with the lead tone. More particularly, the present invention relates to a technique which, when a tone, voice or the like whose pitch varies frequently within a short time period has been input, generates an additional tone that does not fluctuate in tone pitch (hereinafter also referred to as “pitch”) and thus sounds calm and stable to the ear. The tone signal processing apparatus and method of the present invention are applicable to human-voice or musical-instrument-tone processing systems belonging to music-related equipment, such as karaoke apparatus, electronic musical instruments and personal computers.
Heretofore, there have been known tone signal processing apparatus and methods having a tone generation function which detects a pitch of a tone signal of an input tone, voice (typically, human voice) or the like (ultimately, a particular pitch corresponding to any one of the musical pitch names) and generates a tone signal of a lead tone (first tone signal) of the detected pitch. The function also separately determines a pitch (corresponding to any one of the musical pitch names) on the basis of the detected pitch and chord information input via a keyboard or the like, and thereby automatically generates a tone signal of a harmony note or tone (second tone signal) of the determined pitch as an additional tone accompanying the generated lead tone, which serves as the main tone. One example of such tone signal processing apparatus is disclosed in Japanese Patent Application Laid-open Publication No. HEI-11-133954 (hereinafter referred to as “the prior patent literature”). It should be appreciated that the term “tone signal” is used herein to refer to a signal of a voice or any other desired sound rather than being limited to a signal of a musical tone.
The following describes a conventionally-known tone generation processing procedure employed in the apparatus disclosed in the above-identified prior patent literature, with reference to FIG. 5. FIG. 5 is a conceptual diagram explanatory of the tone generation processing procedure, where the vertical axis represents frequency while the horizontal axis represents time. More specifically, FIG. 5 shows, on its left side section, a flow of processes performed in the apparatus and shows, on its right side section, variations of a signal waveform occurring in response to execution of the individual processes. Further, FIG. 6 is a conceptual diagram showing a data organization of a conventionally-known tone pitch determination table that is referenced in determining a pitch of a harmony tone, as will be described later.
First, a sound signal input via a microphone or the like is subjected to a “frequency detection” process, where the input sound signal is converted into a frequency signal. Because this “frequency detection” process may be performed using any desired conventionally-known technique, such as the zero-cross method well known in the field of sound analysis, a detailed description of this process is omitted. Then, the frequency signal is subjected to a “smoothing” process, where variations in the frequency signal are smoothed. Then, the smoothed frequency signal is subjected to a “pitch name detection” process, where the smoothed frequency signal is discretized, at every predetermined time interval, into any one of the pitch names of a twelve-note scale (i.e., note names). More specifically, for each of the predetermined time intervals, the smoothed frequency signal is rounded to a predetermined normalized pitch corresponding to any one of the plurality of musical pitch names determined in semitones (100 cents) (the thus-rounded frequency signal will hereinafter be referred to as “pitch name signal”). In this way, normalized pitches of the input sound signal are detected. Then, in a “convergence curve” process, the detected pitches are converted into a signal continuously varying over time with a characteristic such that, every time the input sound varies in note, it smoothly varies in frequency from the pitch of the last note to the pitch of the new note. Further, in an “output modulation” process, each of the detected pitches of the input sound signal is modulated as appropriate so as to differentiate a pitch of a lead tone to be generated from the original pitch of the input sound. For convenience, in the graph of pitch variation depicted to the right of the rectangular block “output modulation” of FIG. 5, there is shown an example where the detected pitch of the sound signal itself is determined as a pitch of the lead tone to be generated without being subjected to the output modulation.
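The smoothing, pitch name detection and convergence curve steps described above can be illustrated with a minimal Python sketch. It is not the implementation of the prior patent literature: the trailing moving average, the A4 = 440 Hz equal-temperament reference, the note numbering in which 60 is C4, the one-pole glide, and all function names and sample frequencies are assumptions introduced purely for illustration.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def smooth(freqs, window=3):
    """Trailing moving average (an assumed smoothing method)."""
    out = []
    for i in range(len(freqs)):
        chunk = freqs[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def to_note_number(freq_hz, a4_hz=440.0):
    """Round a frequency to the nearest semitone (assumed: A4 = 440 Hz = note 69)."""
    return round(69 + 12 * math.log2(freq_hz / a4_hz))


def note_name(number):
    """Name a note number, e.g. 52 -> 'E3' (assumed convention: 60 is C4)."""
    return f"{NOTE_NAMES[number % 12]}{number // 12 - 1}"


def convergence_curve(note_numbers, rate=0.5):
    """Glide toward each new target pitch instead of jumping (assumed one-pole glide)."""
    out, current = [], float(note_numbers[0])
    for target in note_numbers:
        current += rate * (target - current)
        out.append(current)
    return out


# Hypothetical detected frequencies in Hz, one per predetermined time interval.
freqs = [165.0, 164.0, 166.0, 175.0, 174.0, 175.0]
numbers = [to_note_number(f) for f in smooth(freqs, window=2)]
pitch_names = [note_name(n) for n in numbers]   # the "pitch name signal"
glide = convergence_curve(numbers)              # continuous pitch trajectory
```

In this sketch the rounding step is what turns a continuously varying frequency into discrete pitch names, and the glide step is what turns those discrete names back into a smooth trajectory for tone generation.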
When adding a harmony tone to a lead tone, on the other hand, any one of the pitch names of a twelve-note scale (i.e., note names) is determined from three inputs: the pitch detection result of the input sound signal obtained in the aforementioned “pitch name detection” process (or the pitch of the lead tone determined on the basis of that result), chord information input via a keyboard or the like, and the tone pitch determination table of FIG. 6 prepared in advance. Namely, the tone pitch determination table of FIG. 6 has a plurality of sub tables, one sub table per chord, prestored in a ROM, RAM or the like, and one of the sub tables is identified in accordance with the chord information input via the keyboard or the like. In FIG. 6, only a sub table for a “C major” chord is shown by way of example. The thus-identified sub table is referenced in immediate response to the pitch detection of the input sound signal and on the basis of the pitch detection result, so that a particular pitch corresponding to any one of the musical pitch names is determined as a pitch of a harmony tone. In the tone pitch determination table of FIG. 6, “E0” indicates a note “E” of the same octave as the detected pitch of the lead tone, “C(+1)” indicates a note “C” one octave higher than the detected pitch of the lead tone, and so on. Thus, if the pitch of the lead tone is “E3”, then “G3” will be determined as a pitch of a first harmony tone, and “C4” will be determined as a pitch of a second harmony tone.
In the aforementioned manner, output signals of one or more harmony tones are generated by sequentially performing the “convergence curve” process and the “output modulation” process on pitch name signals comprising pitches that correspond to the pitch names of the twelve-note scale determined in accordance with the tone pitch determination table of FIG. 6, as in the generation of the lead tone. Note-on timing of the lead tone and harmony tones is when the pitch of the sound signal is first detected, while note-off timing of the lead tone and harmony tones is when the pitch of the input sound is no longer detected.
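The note-on and note-off timing can be pictured with a short sketch. It assumes, for illustration only, that pitch detection yields one value per time interval and yields `None` when no pitch is detected; the function name `gate_events` and the event tuples are hypothetical.

```python
def gate_events(detected):
    """Return note-on/note-off events from a per-interval pitch detection result.

    A change of pitch while a note is sounding produces no event here; in the
    described procedure that transition is handled by the convergence curve.
    """
    events = []
    sounding = False
    for i, pitch in enumerate(detected):
        if pitch is not None and not sounding:
            events.append(("note_on", i, pitch))    # pitch has been detected
            sounding = True
        elif pitch is None and sounding:
            events.append(("note_off", i))          # pitch no longer detected
            sounding = False
    return events


events = gate_events([None, "E3", "E3", "F3", None, None, "G3"])
```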
As set forth above, the conventionally-known apparatus is constructed to determine a pitch of a harmony tone on the basis of a pitch detection result of an input sound signal (and hence a pitch of a lead tone), from which it can be understood that the pitch of the harmony tone depends on the pitch of the lead tone. So, if the input sound signal is of a human voice and its pitch fluctuates up and down beyond a semitone interval, like a deep vibrato, within a short time period, e.g. a time period from one vowel detection to the next vowel detection, a harmony tone whose pitch continuously fluctuates more greatly than the lead tone may be generated. Such a harmony tone is undesirable in that it sounds restless and is uncomfortable to hear. For example, according to the tone pitch determination table shown in FIG. 6, if an input sound signal (and hence a lead tone) represents a vibrato varying between the pitch “E3” and the pitch “F3”, then a first harmony tone becomes an output signal whose pitch continuously fluctuates between the pitch “G3” and the pitch “C4”. This means that, while the input sound signal varies in pitch by only one semitone, the harmony tone to be added to the lead tone repeatedly leaps across a pitch interval as great as five semitones within a short time period, and such a harmony tone can hardly serve as a musical expression of a vibrato.
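The arithmetic of this example can be checked with a small sketch. The mapping below contains only the two first-harmony entries cited in the text (lead “E” to “G” in the same octave, lead “F” to “C” one octave higher); the helper names and the note numbering in which 60 is C4 are assumptions made for illustration.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# First-harmony entries for the "C major" chord that are cited in the text.
FIRST_HARMONY = {"E": ("G", 0), "F": ("C", +1)}


def note_number(name):
    """Convert a note such as 'E3' to a semitone index (assumed: 60 is C4)."""
    pitch_name, octave = name[:-1], int(name[-1])
    return 12 * (octave + 1) + NOTE_NAMES.index(pitch_name)


def first_harmony(lead):
    """Look up the first harmony pitch for a lead pitch such as 'E3'."""
    pitch_name, octave = lead[:-1], int(lead[-1])
    harmony_name, offset = FIRST_HARMONY[pitch_name]
    return f"{harmony_name}{octave + offset}"


def largest_leap(notes):
    """Largest step, in semitones, between consecutive notes."""
    numbers = [note_number(n) for n in notes]
    return max(abs(b - a) for a, b in zip(numbers, numbers[1:]))


lead = ["E3", "F3", "E3", "F3"]                  # vibrato detected as alternating names
harmony = [first_harmony(n) for n in lead]       # alternates between G3 and C4
print(largest_leap(lead), largest_leap(harmony))  # prints "1 5"
```

The one-semitone wobble of the lead tone is thus amplified into a five-semitone leap in the harmony tone, which is the problem the paragraph above describes.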
As one approach for avoiding the aforementioned inconvenience, it is conceivable to lower the frequency of the pitch detection of an input voice signal (i.e., how often the pitch detection is performed). However, if the frequency of the pitch detection is lowered, the responsiveness of the harmony tone (additional tone) generation process would undesirably become constantly low, which would degrade its ability to follow a chord change and changes in other performance conditions. Thus, this approach is unsatisfactory. Further, because the lead tone and harmony tone are each generated on the basis of the pitch detection of the input voice signal, the frequency of not only the harmony tone (additional tone) generation process but also the lead tone generation process would decrease, so that the musical character, expressiveness, etc. of the input voice signal may be undesirably lost. For this reason too, the above-mentioned approach is unsatisfactory.