This invention relates to apparatus for recording sound data onto and/or reproducing such recorded data from a voice recording medium. In particular, this invention relates to such apparatus using a solid memory device such as a semiconductor memory as a recording medium, rather than a tape or a disk, and being adapted to carry out data compression and expansion processes based on adaptive pulse code modulation at the time of recording and reproduction of voice data such that a limited memory capacity can be used more efficiently for the recording of digital voice data. Throughout herein, the words "voice" and "sound" are to be interpreted broadly and may be used interchangeably.
Voice recording and/or reproduction apparatus such as tape recorders, which use a magnetic tape as a memory medium, and mini-disk recorders can be improved regarding their durability and miniaturized, inclusive of their electric cells and memory medium, only to a limited extent because they are mechanically driven. This limitation can be eliminated if a semiconductor memory is used as the memory medium because no mechanical driving is required, but even with the recent improvements in semiconductor memory capacity, it is difficult to obtain a memory capacity comparable to that obtainable with a tape. So far, only apparatus capable of recording and reproducing voice data from a relatively short conversation or meeting are becoming available.
Prior art voice recording and reproduction apparatus using a solid memory as recording medium make up for their insufficient memory capacity by carrying out a data compression process when digital voice data are recorded onto the solid memory and a data expansion process in reverse when recorded data are reproduced. With some apparatus, it is possible to select whether the data compression ratio is increased in order to increase the recording time with the same memory capacity or decreased because the quality of the sound is more important. FIG. 6A shows a voice recording and reproducing apparatus 10 of this type which uses a known kind of adaptive pulse code modulation routine for data compression. The data compression ratio is varied by adjusting the clock frequency of the adaptive pulse code modulation circuit and other digital signal processing parts such that the sampling frequency fs and the cutoff frequency fc can be made variable.
Explained in more detail, the apparatus 10 shown in FIG. 6A may be in the form of a small box of the size of a cigarette lighter, containing therein a microphone (MIC) 2 serving as a voice input means, a speaker (SP) 3 serving as a voice output means, a keyboard (KEY) 4 through which instructions such as the starting and stopping of recording and reproduction are received, a controller 5 for controlling the overall operations of the apparatus, a memory device 6 which may comprise an electrically erasable non-volatile flash memory having a memory capacity of 4-32 Mb (megabits), and an IC element 11 serving not only to convert analog voice signals inputted from the microphone 2 into digital voice signals and to record them in the memory device 6 under the control of the controller 5 but also to read out digital voice data from the memory device 6, reproduce the voice signals and output them to the speaker 3.
The IC element 11 for voice recording and reproduction includes a low pass filter (LPF) 12 and an A/D conversion circuit (A/D) 13 for receiving an output from the microphone 2 to convert analog voice signals into digital voice data, and there is also provided an adaptive pulse code modulation (ADPCM, more fully "adaptive differential pulse code modulation") circuit 14 for compressing the digital voice data. For reproduction, the same ADPCM circuit 14 is used in reverse for an expansion process, and there are included a D/A conversion circuit 15 for receiving the output from the ADPCM circuit 14 to convert digital voice data into analog voice signals and another low pass filter (LPF) 16. Voice signals outputted from this low pass filter 16 are amplified by an amplifier (not shown) and then transmitted to the speaker 3.
The IC element 11 also contains a clock circuit (CLK) 17 for providing a clock signal to the clock-synchronous digital signal processing part composed of the A/D conversion circuit 13, the ADPCM circuit 14 and the D/A conversion circuit 15. The clock circuit 17 is adapted to output a clock signal of frequency 1024 KHz (referred to as clock signal y) obtained by waveforming a base oscillation signal and a frequency-divided clock signal of frequency 512 KHz (referred to as clock signal x) and also includes a switch circuit for supplying either of the clock signals to the digital signal processing part. There is also provided a control unit (CONTROL) 18 for generating a control signal A and receiving various commands from the controller 5 to generate various individualized control signals for internal circuits of the IC element 11 such as the ADPCM circuit 14. The IC element further includes a memory interface (I/F) 19 for serving not only to generate signals for memory access but also to carry out processes such as changing the number of bits per processing unit and buffering.
The A/D conversion circuit 13, of which a major portion is shown in FIG. 7A, is of the so-called oversampling type commonly used, for example, in an ASIC, serving also as a filter which will pass only the components in the band 0-4 KHz inclusive of the vocal band, as shown in FIG. 7B, the sampling frequency fs becoming 8 KHz when operating by clock signal y. In FIG. 7B (and other similar figures throughout herein), the horizontal axis indicates frequency in units of KHz and the vertical axis indicates the transmissivity of the corresponding frequency component.
When the operation is by clock signal x, since the speed of the operation is one half, the sampling frequency fs becomes 4 KHz and it serves as a filter adapted to pass only the frequency range of 0-2 KHz included in the vocal band range, as shown in FIG. 7C.
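The relation between the two clock signals and the resulting sampling and cutoff frequencies can be sketched as follows. This is only an illustration: the ratio of 128 between clock frequency and sampling frequency is inferred from the figures given above (1024 KHz giving 8 KHz) and is not stated as such, the cutoff is taken as one half of fs because that matches the quoted 0-4 KHz and 0-2 KHz pass bands, and the function name is hypothetical.

```python
# Assumption: ratio inferred from the text (1024 kHz clock -> 8 kHz sampling).
CLOCK_TO_FS_RATIO = 1024 // 8  # = 128

def sampling_and_cutoff_khz(clock_khz):
    """Return (fs, fc) in kHz for a given clock frequency in kHz."""
    fs = clock_khz / CLOCK_TO_FS_RATIO  # sampling frequency scales with the clock
    fc = fs / 2                         # pass band taken as 0..fs/2, matching the text
    return fs, fc

for name, clock_khz in (("y", 1024), ("x", 512)):
    fs, fc = sampling_and_cutoff_khz(clock_khz)
    print(f"clock signal {name}: {clock_khz} kHz -> fs = {fs:g} kHz, pass band 0-{fc:g} kHz")
```

Halving the clock therefore halves both fs and the pass band at once, which is why no separate filter switching is needed in this arrangement.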
Although not explained in detail, the D/A conversion circuit 15 similarly serves as a filter at the time of digital-to-analog conversion, passing only the components in the frequency range of 0-4 KHz when operating by clock signal y, as shown in FIG. 6E1, and passing only the components in the frequency range of 0-2 KHz when operating by clock signal x, as shown in FIG. 6E2. Since a filtering process with a sharp cutoff characteristic can thus be effected by the A/D and D/A conversion circuits 13 and 15, the low pass filters 12 and 16 which sandwich them are required only to remove aliasing (fold-over) noise and the noise in clock signals x and y. Thus, simple low pass filters having a gentle cutoff characteristic beyond the vocal frequency range (over 4 KHz), as shown in FIGS. 6B and 6F, are sufficient.
The ADPCM circuit 14 is of a general adaptive pulse code modulation type shown in FIG. 8A, serving to codify the differentials, or the increments, in the waveform of the input signal per unit time, as shown in FIG. 8B. If the number of bits representing the length of the code (hereinafter referred to as the code bit number) is 4, for example, one of code values "0"-"15" is assigned to each value of the differentials "-8", "-7", ..., "-1", "+1", "+2", ..., "+8". The quantization noise of this ADPCM circuit 14 increases rapidly as the frequency rises from an intermediate value toward the sampling frequency fs. As with the D/A conversion circuit 15, this frequency characteristic changes depending on whether the circuit operates by clock signal x or y. As a practical example, the noise increases at frequencies from about 4 KHz to 8 KHz when recording or reproducing by clock signal y (as shown hatched in FIG. 8C) and from about 2 KHz to 4 KHz when recording or reproducing by clock signal x (as shown hatched in FIG. 8D).
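The coding of differentials described above can be illustrated with a minimal sketch. It is not the circuit of FIG. 8A: the adaptive adjustment of the step size is omitted, the order in which code values 0-15 are assigned to the sixteen differentials is an assumption, and the function names are hypothetical.

```python
def differential_levels(code_bits):
    """Signed differentials for a code of `code_bits` bits, zero excluded.
    For 4 bits: -8..-1 and +1..+8 (16 values)."""
    half = 2 ** (code_bits - 1)
    return [d for d in range(-half, half + 1) if d != 0]

def encode(samples, code_bits=4, step=1.0):
    """Code each sample as the index of the nearest allowed differential.
    Assumption: code value = position in the ascending list of levels."""
    levels = differential_levels(code_bits)
    predicted = 0.0
    codes = []
    for s in samples:
        target = (s - predicted) / step
        code = min(range(len(levels)), key=lambda i: abs(levels[i] - target))
        codes.append(code)
        predicted += levels[code] * step  # track what the decoder will rebuild
    return codes

def decode(codes, code_bits=4, step=1.0):
    """Rebuild the waveform by accumulating the coded differentials."""
    levels = differential_levels(code_bits)
    value = 0.0
    out = []
    for c in codes:
        value += levels[c] * step
        out.append(value)
    return out

codes = encode([3, 5, 4, 12])
print(codes)          # each entry fits in 4 bits
print(decode(codes))  # waveform rebuilt from the differentials
```

Because the encoder tracks the same running value that the decoder rebuilds, any step larger than the largest differential is followed only gradually over several samples.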
When the apparatus 10 thus structured is used for a short-term recording, the keyboard 4 is operated to select a high-fidelity recording mode. The controller 5 responds by causing the control unit 18 to output a corresponding control signal A such that clock signal y with frequency 1024 KHz is supplied from the clock circuit 17 to the digital signal processing part inclusive of the ADPCM circuit 14. An analog voice signal inputted through the microphone 2 is passed through the LPF 12 to have high-frequency components over the vocal frequency band mostly eliminated (as shown in FIG. 6B). It is then converted into digital voice data by means of the A/D conversion circuit 13, and high-frequency components over 4 KHz are simultaneously cut off as shown in FIG. 6C1. The digital voice data are compressed by the ADPCM circuit 14 by 4-bit encoding with the timing of the sampling frequency of 8 KHz and sequentially recorded through the memory interface I/F 19 into the memory device 6. If the capacity of this memory device 6 is 32 Mb, the recording can continue for about 16 minutes.
At the time of reproduction, the compressed encoded data are read out from the memory device 6 through the memory interface I/F 19 by the ADPCM circuit 14 operating on the same clock signal y, and the original voice data are reproduced through an expansion process. Although many noise components are contained in the high-frequency region near 8 KHz (as shown hatched in FIG. 6D1), the band components over 4 KHz are cut (as shown in FIG. 6E1) when the data are converted into an analog voice signal by means of the D/A conversion circuit 15 on the downstream side, and the noise components corresponding to the clock signal y caused by the D/A conversion circuit 15 are finally cut (as shown in FIG. 6F) such that a high-quality sound can be outputted from the speaker 3.
When this apparatus 10 is used for a long-term recording, the keyboard 4 is operated to select a long-term recording mode. The control unit 18 outputs a corresponding control signal (also indicated by the same symbol A for convenience), causing clock signal x of frequency 512 KHz to be supplied to the ADPCM circuit 14, etc. After an analog voice signal inputted through the microphone 2 is passed through the LPF 12 to have high-frequency components over the vocal frequency band mostly cut (as shown in FIG. 6B), it is converted into digital voice data by the A/D conversion circuit 13, and high-frequency components over 2 KHz are cut off at the same time (as shown in FIG. 6C2). The digital voice data are compressed by the ADPCM circuit 14 by 4-bit encoding with the timing matching the sampling frequency of 4 KHz and then sequentially recorded in the memory device 6. If the capacity of this memory device 6 is the same (32 Mb), the recording can continue for about 32 minutes.
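The recording times quoted for the two modes follow from the memory capacity divided by the bit rate (sampling frequency times code bit number). The sketch below is an estimate only: it assumes a megabit of 1,000,000 bits and no storage overhead, neither of which is specified here, and the function name is hypothetical.

```python
def recording_minutes(capacity_megabits, fs_khz, code_bits, bits_per_megabit=1_000_000):
    """Estimated recording time in minutes.
    Assumptions: decimal megabits, no header or management overhead."""
    bit_rate = fs_khz * 1000 * code_bits              # bits per second
    seconds = capacity_megabits * bits_per_megabit / bit_rate
    return seconds / 60

hi_fi = recording_minutes(32, fs_khz=8, code_bits=4)      # clock signal y
long_term = recording_minutes(32, fs_khz=4, code_bits=4)  # clock signal x
print(f"high-fidelity mode: {hi_fi:.1f} min")
print(f"long-term mode:     {long_term:.1f} min")
```

The estimates are of the same order as the figures of about 16 and about 32 minutes, and the ratio between the two modes is exactly two because only fs changes.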
At the time of reproduction, the compressed encoded data are read out from the memory device 6 by the ADPCM circuit 14 operating by the same clock signal x, and the original voice data are reproduced through an expansion process. Although many noise components are contained in the high-frequency region below 4 KHz (as shown hatched in FIG. 6D2), the band components over 2 KHz are cut (as shown in FIG. 6E2) when the data are converted into an analog voice signal by means of the D/A conversion circuit 15 on the downstream side, and the noise components corresponding to the clock signal x are cut by the LPF 16 (as shown in FIG. 6F), the reproduced voice being outputted from the speaker 3. Because frequency components over 2 KHz, which fall within the vocal frequency band, are cut, the sound quality is somewhat affected, but a sound reproduction which is good enough for distinguishing voice can be carried out for a time period twice as long as if the high-fidelity recording mode had been selected.
In summary, this apparatus 10 is capable of switching the clock frequency for the adaptive pulse code modulation process to thereby change the sampling frequency fs such that the compression ratio of voice data is altered and a longer recording time becomes possible by sacrificing the sound quality.
Besides the changing of sampling frequency fs, the changing of coding bit number may also be considered as a simple data compression method for extending the time of recording and reproduction with a limited memory capacity. FIG. 9 shows an apparatus 20 of this type with a voice recording and reproducing IC element 21. For convenience, like components of this apparatus 20 and the apparatus 10 shown in FIG. 6A and explained above are indicated by the same numerals and repetitive descriptions may be dispensed with.
The apparatus 20 shown in FIG. 9 has a simpler clock circuit 27 (than the clock circuit 17 of FIG. 6A), adapted to output only clock signal y of frequency 1024 KHz, but its ADPCM circuit 24 reduces the code bit number to 2 and assigns codes only to the differentials "-2", "-1", "+1" and "+2" when the long-term recording mode is selected. If the code bit number is thus reduced, the range from which differentials can be taken is also reduced accordingly. As a result, the slew rate of the ADPCM circuit 24 is lowered, and many noise components appear in the high-frequency range below 4 KHz, as shown in FIG. 6D2. Thus, the apparatus 20 of FIG. 9 functions similarly to the apparatus 10 of FIG. 6A if the high-fidelity mode is selected (as shown in FIGS. 6B, 6C1, 6D1, 6E1 and 6F). If the long-term recording mode is selected, however, the cutoff frequency of the D/A conversion circuit 15 remains fixed (cutting only components over 4 KHz, as shown in FIG. 6E1), and hence these quantization noise components are outputted, too.
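The trade-off between the two approaches can be put in rough numbers. The sketch below compares the bit rate and the largest change per millisecond that the coder can follow, under the simplifying assumption of a fixed unit step size (an actual adaptive circuit varies the step, so these are not figures for the ADPCM circuit 24 itself). The function names are hypothetical.

```python
def max_differential(code_bits):
    """Largest differential magnitude: 8 for 4 bits, 2 for 2 bits."""
    return 2 ** (code_bits - 1)

def bit_rate_kbps(fs_khz, code_bits):
    """Bit rate in kilobits per second."""
    return fs_khz * code_bits

def max_slope_per_ms(fs_khz, code_bits, step=1.0):
    """Largest signal change per millisecond the coder can track.
    Assumption: fixed step size; fs_khz equals samples per millisecond."""
    return max_differential(code_bits) * step * fs_khz

modes = {
    "high fidelity (8 kHz, 4 bits)":            (8, 4),
    "long term by lower fs (4 kHz, 4 bits)":    (4, 4),
    "long term by fewer bits (8 kHz, 2 bits)":  (8, 2),
}
for name, (fs, bits) in modes.items():
    print(f"{name}: {bit_rate_kbps(fs, bits)} kbit/s, "
          f"max slope {max_slope_per_ms(fs, bits):g} steps/ms")
```

Both long-term variants halve the bit rate, but under this fixed-step assumption the 2-bit variant can follow only a quarter of the slope of the high-fidelity mode, which is consistent with the increased high-range noise described above.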
In a situation like this, distortions of the reproduced voice become large in the high range, making the reproduced voice difficult to understand. This is why prior art voice recording and reproducing apparatus using the adaptive pulse code modulation process changed the data compression ratio not by switching the code bit number but by changing the sampling frequency fs. According to Japanese Patent Publication Tokkai 7-160300, different encoding and compression methods were used for compressing data by switching the code bit number.
It is not a practical approach, however, to combine an adaptive pulse code modulation circuit and a plurality of data compression methods for switching the code bit number because the scale of the circuit increases in spite of the general requirement for miniaturization. Neither is an increase in the scale of the circuit desirable from the point of view of cost.
The proposal to simply lower the sampling frequency fs is not acceptable either because the components between 2-4 KHz, necessary for distinguishing the input voice, are totally lost.