1. Field of the Invention
The present invention relates to a method and device for editing singing voice synthesis data that directs control over synthesis of a singing voice. The invention also relates to a method for analyzing a singing voice that generates singing characteristics data used for editing singing voice synthesis data.
2. Description of the Related Art
In the art of singing voice synthesis, a technique is known for synthesizing a singing voice based on singing voice synthesis data. The term singing voice synthesis data as used here refers to sequence data including note data specifying a duration and pitch of a voice, lyrics data associated with the note data, and sound control data. Examples of the kinds of data included in the sound control data are volume control data for controlling a volume of a voice outputting lyrics indicated by the lyrics data, and pitch control data for controlling a pitch of the voice.
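By way of illustration only, the composition of the singing voice synthesis data described above may be sketched as follows. The class names, the field names, the tick-based timing, and the use of MIDI-style note numbers are all assumptions introduced for this sketch and are not taken from any particular synthesizer.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Note:
    """Note data with its associated lyrics data (hypothetical layout)."""
    start_tick: int      # time of note-on, in sequencer ticks (assumed unit)
    duration_ticks: int  # voicing duration
    pitch: int           # pitch as a MIDI-style note number (assumption)
    lyric: str           # lyrics data associated with this note


@dataclass
class ControlEvent:
    """One piece of sound control data (hypothetical layout)."""
    tick: int
    kind: str     # "volume" or "pitch"
    value: float


@dataclass
class SingingVoiceSynthesisData:
    """Sequence data: note data, lyrics data, and sound control data."""
    notes: List[Note] = field(default_factory=list)
    controls: List[ControlEvent] = field(default_factory=list)


example = SingingVoiceSynthesisData(
    notes=[Note(0, 480, 60, "la"), Note(480, 480, 62, "li")],
    controls=[ControlEvent(0, "volume", 0.8), ControlEvent(240, "pitch", 0.5)],
)
```

In this sketch the lyrics data is held on each note so that the association between a piece of note data and its lyrics is explicit; other layouts are equally possible.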
The singing voice synthesis data may be freely edited by a user and stored in a memory. The different kinds of data constituting the singing voice synthesis data, i.e., each piece of note data, the lyrics data associated with each piece of note data, and the sound control data, are read out from the memory in a sequential manner by a sequencer and supplied to a singing voice synthesizer. The singing voice synthesizer synthesizes singing voice signals that correspond to the lyrics indicated by the lyrics data supplied by the sequencer and that have a pitch and voicing duration specified by the note data. The singing voice synthesizer then performs sound control, such as volume and pitch control, on the singing voice signals based on the sound control data, and outputs the resulting signals.
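The sequential readout described above may be sketched, under stated assumptions, as follows. The function run_sequencer, the class StubSynthesizer, and the dictionary keys are hypothetical names chosen for this sketch. The stub produces no audio and applies a control value only to notes that start at or after the control event, which is a simplification of continuous sound control.

```python
def run_sequencer(notes, controls):
    """Return note and control events in time order (hypothetical sequencer).

    Control events at the same tick as a note are delivered first so that
    they take effect on that note.
    """
    events = [(c["tick"], 0, "control", c) for c in controls]
    events += [(n["tick"], 1, "note", n) for n in notes]
    events.sort(key=lambda e: (e[0], e[1]))
    return [(kind, payload) for _, _, kind, payload in events]


class StubSynthesizer:
    """Stand-in for a singing voice synthesizer; it only logs what it would voice."""

    def __init__(self):
        self.volume = 1.0
        self.pitch_offset = 0.0
        self.log = []

    def handle(self, kind, payload):
        if kind == "note":
            # Record lyric, controlled pitch, voicing duration, and volume.
            self.log.append((payload["lyric"],
                             payload["pitch"] + self.pitch_offset,
                             payload["duration"],
                             self.volume))
        elif payload["kind"] == "volume":
            self.volume = payload["value"]
        elif payload["kind"] == "pitch":
            self.pitch_offset = payload["value"]


notes = [
    {"tick": 0, "duration": 480, "pitch": 60, "lyric": "la"},
    {"tick": 480, "duration": 480, "pitch": 62, "lyric": "li"},
]
controls = [
    {"tick": 0, "kind": "volume", "value": 0.8},
    {"tick": 480, "kind": "pitch", "value": 0.5},
]

synth = StubSynthesizer()
for kind, payload in run_sequencer(notes, controls):
    synth.handle(kind, payload)
```

After the loop, synth.log holds one entry per note, each reflecting the volume and pitch control values in effect when that note started.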
When an actual person sings, the first voicing of a phrase segmented by silent sections strongly characterizes the singer. It may therefore be desirable to make singing considerably more expressive by varying both volume and pitch at the start of a phrase. Japanese Patent Application Laid-Open Publication No. 2015-034920 (JP 2015-034920, hereinafter) discloses a technology in which a probability model is used to machine-learn a relationship between pitch transitions of synthesized singing represented by reference music track data, which consists of a combination of note data and lyrics data of a particular music track, and pitch transitions of reference singing data obtained by actually singing the particular music track. Singing characteristics data that define the probability model are then generated.
One possibility for making singing more expressive is to generate singing characteristics data by using the technology of JP 2015-034920, and then to generate, based on the singing characteristics data, sound control data that imparts variation in pitch and volume at the beginning of a phrase. In the technology of JP 2015-034920, however, the section for which the probability model performs machine learning is determined based on the note data of the reference music track data. Because the technology interprets a section immediately before note-on as a silent section and thus differentiates it from a voiced section, it is not able to obtain singing characteristics data that could be used to enhance musical expressivity in that section.
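The limitation described above may be illustrated with a minimal sketch. The sketch is not the implementation of JP 2015-034920; it assumes only that the learning sections are derived from note data, as stated above. The function label_frames, the frame-based timing, and the pitch values are hypothetical and chosen solely for illustration.

```python
def label_frames(notes, num_frames):
    """Label each frame "voiced" or "silent" using note data alone.

    notes is a list of (note_on_frame, duration_in_frames) pairs.
    """
    labels = ["silent"] * num_frames
    for onset, duration in notes:
        for f in range(onset, min(onset + duration, num_frames)):
            labels[f] = "voiced"
    return labels


# One note: note-on at frame 4, lasting 3 frames (illustrative values).
labels = label_frames([(4, 3)], num_frames=8)

# Hypothetical sung pitch contour (Hz) in which the singer starts voicing
# at frame 2, i.e., two frames before note-on.
sung_f0 = {2: 250.0, 3: 258.0, 4: 262.0, 5: 262.0, 6: 261.0}

# Frames that a note-derived learning section would use.
training_frames = [f for f, lab in enumerate(labels) if lab == "voiced"]

# Sung frames that fall in a section labeled silent and are therefore excluded.
dropped = sorted(f for f in sung_f0 if labels[f] == "silent")
```

In this sketch the pitch movement at frames 2 and 3, immediately before note-on, falls in a section labeled silent, and so contributes nothing to the learned characteristics.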