The present invention relates to a voice storage device and a voice coding device suitable for use in a telephone answering device of the digital coding and recording type, or in other digital coding voice recorders.
Telephone answering devices have been incorporated into fixed subscriber telephone terminals and portable telephone terminals. A telephone answering device, also called voice mail, records the voice of a message sender on a built-in recording medium (magnetic tape or semiconductor memory) when the terminal user is unable to answer a telephone call.
In recent years, semiconductor LSI circuits that perform digital signal processing at low cost have become available, and telephone answering devices have been proposed that compress a talker's voice with a high-efficiency coding algorithm such as CELP (code excited linear prediction) and store the result in a recording medium. For a recording medium of the same capacity, such a device can record more voice than one using the ordinary PCM (Pulse Code Modulation) system or the ADPCM (Adaptive Differential Pulse Code Modulation) system. Further, the use of a semiconductor memory makes it possible to quickly select and reproduce a specific message out of plural messages.
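The capacity comparison above can be illustrated with simple arithmetic. The following is a minimal sketch only: the bit rates (64 kbit/s for PCM, 32 kbit/s for ADPCM, 8 kbit/s for a CELP-type coder) are typical illustrative values, and the 1,000,000-byte memory size and the function name `recording_seconds` are assumptions made for this example; none of these figures is taken from the present description.

```python
def recording_seconds(capacity_bytes: int, bitrate_bps: int) -> float:
    """Return how many seconds of coded voice fit in the given memory."""
    return capacity_bytes * 8 / bitrate_bps


# Illustrative bit rates in bits per second (assumed, not from the source).
ILLUSTRATIVE_BITRATES = {
    "PCM": 64_000,
    "ADPCM": 32_000,
    "CELP": 8_000,
}

# Hypothetical memory capacity used only for this comparison.
CAPACITY_BYTES = 1_000_000

for name, bitrate in ILLUSTRATIVE_BITRATES.items():
    seconds = recording_seconds(CAPACITY_BYTES, bitrate)
    print(f"{name}: {seconds:.0f} s")
```

Under these assumed rates, the same memory holds eight times as much CELP-coded voice as PCM-coded voice, which is the kind of gain the passage refers to.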
The telephone answering device has also been incorporated into portable telephone terminals. However, because users demand miniaturized terminals, the capacity of the semiconductor memory that can be incorporated into the terminal is severely restricted. Accordingly, the ordinary CELP system alone has not been able to provide the voice recording time that is required.
Under these circumstances, a method of combining a voice activity detector with the telephone answering device has been adopted in practice. In this method, the talker's vocalization is monitored while the message voice is coded, compressed, and recorded in the medium, for example by comparing the voice gain to a threshold. Based on the comparison result, during the interval between one vocalization and the next (the non-vocalization interval), which is of comparatively low importance, the coding and recording of the voice are suspended, and only the duration of the non-vocalization interval is recorded in the medium. As a result, the apparent coding efficiency is increased, and the recording medium is used more efficiently.
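The recording scheme described above may be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation of any particular device: the gain is taken to be the mean squared sample value of a frame, the coder is replaced by storing the frame as-is, and the names `frame_gain` and `record_with_vad` as well as the record layout are hypothetical.

```python
from typing import List, Sequence, Tuple

Record = Tuple[str, object]


def frame_gain(frame: Sequence[float]) -> float:
    """Mean squared amplitude of one non-empty frame (assumed gain measure)."""
    return sum(s * s for s in frame) / len(frame)


def record_with_vad(frames: Sequence[Sequence[float]],
                    threshold: float) -> List[Record]:
    """Store voiced frames; store only a frame count for each silent run."""
    records: List[Record] = []
    silent_run = 0  # consecutive frames below the threshold
    for frame in frames:
        if frame_gain(frame) >= threshold:
            if silent_run:
                # Close the preceding non-vocalization interval by
                # recording its duration only.
                records.append(("silence", silent_run))
                silent_run = 0
            # Placeholder for coding: a real device would store coded bits.
            records.append(("speech", list(frame)))
        else:
            silent_run += 1  # coding and recording are suspended
    if silent_run:
        records.append(("silence", silent_run))
    return records


frames = [[0.5, -0.5], [0.0, 0.01], [0.0, 0.0], [0.4, 0.4]]
for kind, payload in record_with_vad(frames, threshold=0.01):
    print(kind, payload)
```

In this sketch the two low-gain frames in the middle collapse into a single duration record, so the medium holds two stored frames and one counter instead of four stored frames.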
For identifying a voice interval, methods using information on the gain or the pitch (frequency components) of the voice have been proposed. However, when the signal-to-noise (S/N) ratio of the voice relative to the background noise deteriorates, the voice detection capability tends to decline. Specifically, in a vocalization interval adjacent to a non-vocalization interval, the voice at the beginning and at the end of a word is misidentified as belonging to the non-vocalization interval, and that part of the voice is likely to be lost.
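The loss of word beginnings and endings can be illustrated with a toy gain-threshold detector. The sketch below rests on several assumptions that are not part of the present description: the per-frame gains are invented numbers, uncorrelated background noise is assumed to add to the speech in power, and the threshold is assumed to be set at a fixed multiple (`margin`) of the noise gain so that noise alone is rejected. The names `classify` and `missed_frames` are hypothetical.

```python
from typing import List


def classify(gains: List[float], threshold: float) -> List[str]:
    """Label each frame by comparing its observed gain to the threshold."""
    return ["speech" if g >= threshold else "silence" for g in gains]


def missed_frames(speech_gains: List[float], noise_gain: float,
                  margin: float = 2.0) -> List[int]:
    """Return indices of speech frames that the detector labels as silence."""
    # Assumption: the threshold must sit above the noise floor.
    threshold = noise_gain * margin
    # Assumption: uncorrelated noise adds to the speech in power.
    observed = [g + noise_gain for g in speech_gains]
    labels = classify(observed, threshold)
    return [i for i, label in enumerate(labels) if label == "silence"]


# Invented gains for one word: weak beginning, strong middle, weak ending.
WORD_GAINS = [0.002, 0.200, 0.300, 0.004]

print("quiet room:", missed_frames(WORD_GAINS, noise_gain=0.0005))
print("noisy room:", missed_frames(WORD_GAINS, noise_gain=0.01))
```

With the low noise gain every frame of the word is detected, whereas with the higher noise gain the first and last frames fall below the raised threshold and would be treated as part of the non-vocalization interval.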