When voice signals recorded on a recording medium, e.g., a CD, cassette tape, or video tape, are played back, the playback speed is sometimes changed from the standard playback speed. For example, the playback speed is increased to listen to a prescribed amount of voice in a shorter time, and it is reduced when the voice is hard to follow because of, for example, rapid speech. To convert the playback speed, the revolution speed of the CD or the running speed of the tape is increased or reduced. In this playback method, however, the frequency of the voice signals read from the recording medium changes in proportion to the playback speed, so the tone of the voice inevitably changes and the resulting voice is hard to listen to.
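The frequency shift described above can be illustrated with a minimal sketch. It assumes a pure sine wave stands in for the voice signal, and it emulates a faster-running medium by keeping every second sample; the function names and the 200 Hz / 8000 Hz values are hypothetical and chosen only for illustration.

```python
import math

def sine(freq_hz, sample_rate, n_samples):
    """Generate a sine wave as a list of floats (stand-in for a voice signal)."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n_samples)]

def speed_up_naive(samples, factor):
    """Emulate running the medium `factor` times faster:
    keep every factor-th sample and play the result at the same sample rate."""
    return samples[::factor]

def estimate_freq(samples, sample_rate):
    """Rough frequency estimate: count negative-to-non-negative sign changes
    and divide by the duration of the signal."""
    rising = sum(1 for a, b in zip(samples, samples[1:]) if a < 0 <= b)
    return rising * sample_rate / len(samples)

SAMPLE_RATE = 8000
original = sine(200, SAMPLE_RATE, SAMPLE_RATE)   # one second at 200 Hz
fast = speed_up_naive(original, 2)               # double-speed playback

f_original = estimate_freq(original, SAMPLE_RATE)
f_fast = estimate_freq(fast, SAMPLE_RATE)
```

In this sketch the estimated frequency of `fast` is roughly twice that of `original`, which corresponds to the change of tone that makes the sped-up voice hard to listen to.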
Thus, a method for converting the playback speed without changing the tone has been proposed. The method comprises a step of dividing the original voice signals into a plurality of voice blocks An (n is a natural number), each having a predetermined time length, and a step of changing the combination of the voice blocks. For example, for double-speed playback, every other voice block is played back (e.g., A1-A3-A5 . . .), so that the playback time is reduced by half, and the voice is played back without substantially changing the tone because the frequency of the original voice signals is largely preserved within each voice block.
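The double-speed case can be sketched as follows. This is only an illustration under the assumption that the signal is a list of samples and that every block has the same fixed length; `split_blocks`, `double_speed`, and `block_len` are hypothetical names, not terms from the proposed method.

```python
def split_blocks(samples, block_len):
    """Divide the signal into consecutive voice blocks A1, A2, ... of block_len samples."""
    return [samples[i:i + block_len] for i in range(0, len(samples), block_len)]

def double_speed(samples, block_len):
    """Keep every other block (A1, A3, A5, ...) and join the kept blocks."""
    kept = split_blocks(samples, block_len)[::2]
    return [s for block in kept for s in block]

# Twelve samples, blocks of three: A1=[0,1,2], A2=[3,4,5], A3=[6,7,8], A4=[9,10,11]
result = double_speed(list(range(12)), 3)
```

Here `result` contains the samples of A1 followed by those of A3, so the output is half as long as the input while the samples inside each kept block are untouched.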
Note that the original voice signals are divided into voice blocks at intervals of a basic cycle, which is the reciprocal of a basic frequency, i.e., the lowest frequency among the frequency components included in that voice block. Since the voice signals vary continuously, the basic frequency also varies, and the time lengths of adjacent voice blocks usually differ.
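As a small worked example of the relation between the basic frequency and the block length, the sketch below converts a sequence of basic frequencies into block lengths in samples. The frequency values, the 8000 Hz sample rate, and the function names are assumptions made for illustration; how the basic frequency is actually measured is outside the scope of this sketch.

```python
def basic_cycle_samples(basic_freq_hz, sample_rate):
    """Basic cycle (reciprocal of the basic frequency), expressed in samples."""
    return round(sample_rate / basic_freq_hz)

def block_lengths(basic_freqs_hz, sample_rate):
    """One block length per measured basic frequency."""
    return [basic_cycle_samples(f, sample_rate) for f in basic_freqs_hz]

# Hypothetical basic frequencies of three successive voice blocks
lengths = block_lengths([100, 125, 160], 8000)
```

With these assumed values the three blocks are 80, 64, and 50 samples long (10 ms, 8 ms, and 6.25 ms), showing how a varying basic frequency yields adjacent blocks of different time lengths.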
However, if the original voice signals are divided into the voice blocks An at an improper time length, the signal at the end of one voice block is not continuous with the signal at the start of the voice block joined to it when the combination of the voice blocks is changed to convert the playback speed, so rasping noises are generated.
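The discontinuity can be made visible with a rough sketch that measures the largest jump in amplitude at the joins after every other block is dropped. It assumes a 200 Hz sine sampled at 8000 Hz (a cycle of 40 samples) as a stand-in for voice; the block lengths of 30 and 40 samples and all function names are hypothetical.

```python
import math

def sine(freq_hz, sample_rate, n_samples):
    """Generate a sine wave as a list of floats (stand-in for a voice signal)."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n_samples)]

def kept_blocks(samples, block_len):
    """Split into fixed-length blocks and keep every other one (A1, A3, ...)."""
    blocks = [samples[i:i + block_len] for i in range(0, len(samples), block_len)]
    return blocks[::2]

def max_join_jump(blocks):
    """Largest amplitude step between the last sample of one kept block
    and the first sample of the next kept block."""
    return max(abs(nxt[0] - prev[-1]) for prev, nxt in zip(blocks, blocks[1:]))

signal = sine(200, 8000, 240)                         # six cycles of 40 samples
jump_improper = max_join_jump(kept_blocks(signal, 30))  # block is 3/4 of a cycle
jump_matched = max_join_jump(kept_blocks(signal, 40))   # block is exactly one cycle
```

In this sketch the join jump for the 30-sample blocks is close to the full amplitude of the signal, whereas for the 40-sample blocks it is about the size of an ordinary sample-to-sample step. A jump of the former kind is what is heard as rasping noise.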
In another method, the dividing points of the voice blocks An are defined on the basis of zero cross points of the original voice signals, so that the voice blocks are connected at zero cross points and the noises can be reduced. Technologies for dividing voice signals at zero cross points are disclosed in, for example, Patent Documents 1-3.
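Dividing at zero cross points can be sketched as below. This is not the procedure of Patent Documents 1-3, whose details are not given here; it is a minimal illustration that assumes only negative-to-non-negative crossings are used as dividing points, so that every block starts at the same phase. The function names are hypothetical.

```python
def zero_cross_points(samples):
    """Indices i at which the signal passes from negative (i-1) to non-negative (i)."""
    return [i for i in range(1, len(samples)) if samples[i - 1] < 0 <= samples[i]]

def split_at_zero_cross(samples):
    """Divide the signal into blocks whose boundaries are zero cross points."""
    cuts = [0] + zero_cross_points(samples) + [len(samples)]
    return [samples[a:b] for a, b in zip(cuts, cuts[1:]) if b > a]

samples = [0.5, -0.5, -0.2, 0.3, 0.6, -0.4, 0.1]
points = zero_cross_points(samples)
blocks = split_at_zero_cross(samples)
```

Because each block after the first begins just after the signal crosses zero in the same direction, joining non-adjacent blocks produces a much smaller step than joining blocks cut at arbitrary positions.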