1. Field of the Invention
The present invention relates generally to a voice speed converting system for converting the voice speed of a sound signal, and more particularly, to a voice speed converting system utilized for an image and voice reproducing device, such as a laser disk player or a VTR, in which voice is heard at high speed or at low speed, a hearing aid system for converting a broadcast sound signal into a slow voice that is easy for hearing-impaired listeners to hear, a language learning machine for converting a voice in a foreign language spoken at native speed into a slow voice that is easy to hear, and the like.
2. Description of the Prior Art
Examples of conventional techniques for converting the voice speed include an analog type time-scale expansion and compression technique. In a voice speed converting method using the analog type time-scale expansion and compression technique, however, only simple thinning processing or repeated insertion processing of voice waveforms is performed. Therefore, the joints between the waveform segments are discontinuous, whereby the quality of sound is deteriorated.
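The discontinuity at the joints can be illustrated with a minimal sketch. The function names `naive_time_scale` and `max_jump`, the 8 kHz sampling rate, the 100 Hz test tone, and the 100-sample segment length are all assumptions chosen for illustration and are not taken from any conventional system; the sketch uses digital samples only to imitate the fixed-length thinning and repeated insertion described above.

```python
import math


def naive_time_scale(samples, frame_len, mode):
    """Thin out or repeat fixed-length segments with no regard to the waveform."""
    out = []
    for index, start in enumerate(range(0, len(samples), frame_len)):
        frame = samples[start:start + frame_len]
        if mode == "compress":
            # Thinning: keep every other segment (roughly double speed).
            if index % 2 == 0:
                out.extend(frame)
        elif mode == "expand":
            # Repeated insertion: emit each segment twice (roughly half speed).
            out.extend(frame)
            out.extend(frame)
        else:
            raise ValueError("mode must be 'compress' or 'expand'")
    return out


def max_jump(samples):
    """Largest sample-to-sample step, used here as a crude discontinuity measure."""
    return max(abs(b - a) for a, b in zip(samples, samples[1:]))


# Hypothetical test signal: 100 Hz tone sampled at 8 kHz (period of 80 samples).
RATE = 8000
tone = [math.sin(2 * math.pi * 100 * n / RATE) for n in range(800)]

# A 100-sample segment is not a whole number of periods, so joints do not line up.
slow = naive_time_scale(tone, 100, "expand")
fast = naive_time_scale(tone, 100, "compress")

jump_original = max_jump(tone)
jump_slow = max_jump(slow)
jump_fast = max_jump(fast)
```

In this sketch the largest step in the original tone is small, whereas the expanded and compressed signals contain steps close to the full amplitude at the segment joints, which would be heard as clicks.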
Examples of the time-scale expansion and compression technique in which a good quality of sound is obtained include a technique for detecting the pitch cycle of voice by digital signal processing and thinning or inserting a portion of the waveform in units of the detected pitch cycle or integral multiples of the pitch cycle. In a voice speed converting method using this digital type time-scale expansion and compression technique, however, a sound signal is compressed or expanded at a uniform rate of compression or expansion irrespective of whether a given portion of the sound signal is a silence section or a voice section. Accordingly, the reproduction speed in the voice section is too high, for example, when a VTR is reproduced at twice the speed or when voice in a foreign language is reproduced by a language learning machine, so that the voice cannot be easily followed.
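A minimal sketch of pitch-cycle based expansion and compression is given below. It assumes that the pitch cycle is estimated by a normalized autocorrelation search, which is one common approach and is not stated in the text above; the names `estimate_pitch_period`, `pitch_synchronous_scale`, and `max_jump`, the lag search range, and the test tone are likewise hypothetical.

```python
import math


def estimate_pitch_period(samples, min_lag, max_lag):
    """Return the lag (in samples) with the highest normalized autocorrelation."""
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        count = len(samples) - lag
        score = sum(samples[n] * samples[n + lag] for n in range(count)) / count
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def pitch_synchronous_scale(samples, period, mode):
    """Insert or thin out whole pitch cycles at a uniform rate."""
    out = []
    for index, start in enumerate(range(0, len(samples), period)):
        cycle = samples[start:start + period]
        if mode == "expand":
            # Insert one extra copy of every cycle (roughly half speed).
            out.extend(cycle)
            out.extend(cycle)
        elif mode == "compress":
            # Thin out every other cycle (roughly double speed).
            if index % 2 == 0:
                out.extend(cycle)
        else:
            raise ValueError("mode must be 'compress' or 'expand'")
    return out


def max_jump(samples):
    """Largest sample-to-sample step, used here as a crude discontinuity measure."""
    return max(abs(b - a) for a, b in zip(samples, samples[1:]))


# Hypothetical test signal: 100 Hz tone sampled at 8 kHz (period of 80 samples).
RATE = 8000
tone = [math.sin(2 * math.pi * 100 * n / RATE) for n in range(800)]

period = estimate_pitch_period(tone, 40, 120)
slow = pitch_synchronous_scale(tone, period, "expand")
fast = pitch_synchronous_scale(tone, period, "compress")
jump_slow = max_jump(slow)
```

Because the joints fall on pitch-cycle boundaries, the waveform remains continuous in this sketch. Every cycle is nevertheless treated in the same way, whether it belongs to a voice section or a silence section, which corresponds to the uniform rate noted above.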
In order to solve the above-described problems, a voice speed converting method for discriminating between a silence section and a voice section in a sound signal, deleting the silence section, and expanding the voice section in units of the pitch cycle has already been developed. Such a method is disclosed in the following documents A and B.
Document A: TECHNICAL REPORT OF IEICE, SP92-56, HC92-33 (1992-09) entitled "A METHOD OF ABSORBING TEMPORAL ENLARGEMENT OF SPEECH LENGTHS IN THE VOICE SPEED CONVERTING SYSTEM FOR ELDERLY", issued by THE INSTITUTE OF ELECTRONICS, INFORMATION AND COMMUNICATION ENGINEERS.
Document B: TECHNICAL REPORT OF IEICE, SP92-150 (1993-03) entitled "EVALUATION OF SPEECH-RATE CONVERSION METHOD BY HEARING-IMPAIRED LISTENERS", issued by THE INSTITUTE OF ELECTRONICS, INFORMATION AND COMMUNICATION ENGINEERS.
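The general idea of discriminating between silence and voice, deleting the silence section, and expanding the voice section might be sketched as follows. This is not the procedure of document A or document B, whose details are not reproduced here. The energy threshold used for the discrimination, the fixed and already known pitch cycle of 80 samples, the complete deletion of silence, and the names `frame_energy`, `classify_frames`, and `delete_silence_and_expand` are all assumptions made for illustration.

```python
import math


def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def classify_frames(samples, frame_len, threshold):
    """Label each frame True (voice) or False (silence) by its mean energy."""
    return [
        frame_energy(samples[start:start + frame_len]) > threshold
        for start in range(0, len(samples), frame_len)
    ]


def delete_silence_and_expand(samples, frame_len, threshold):
    """Drop silence frames and emit each voice frame twice."""
    out = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        if frame_energy(frame) > threshold:
            # Voice section: expand by inserting one extra copy of the cycle.
            out.extend(frame)
            out.extend(frame)
        # Silence section: nothing is emitted, so the frame is deleted.
    return out


# Hypothetical input: silence, two pitch cycles of a tone, then silence again.
PERIOD = 80  # assumed pitch cycle in samples, taken as already detected
voiced = [math.sin(2 * math.pi * n / PERIOD) for n in range(2 * PERIOD)]
signal = [0.0] * 160 + voiced + [0.0] * 160

labels = classify_frames(signal, PERIOD, 0.01)
converted = delete_silence_and_expand(signal, PERIOD, 0.01)
```

In this sketch the 160 voiced samples are stretched to 320 samples while the 320 silent samples are removed, so that the voice section is slowed down without increasing the total duration.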
According to this method, the reproduction speed in the voice section can be reduced, thereby making it easy to hear the voice. This method, however, has the following problems.
In a first conventional system disclosed in the document A, the processing load is large, so that a high-speed operation is required, which increases power consumption. In a second conventional system disclosed in the document B, the deviation between video and voice becomes too great, which makes it difficult to grasp the contents, and the capacity of a memory for storing sound signals becomes very large, which increases costs.