Nowadays, encoding is often performed when transmitting or storing video signals and voice signals. In particular, video signals, which carry enormous amounts of data, are subjected to high-efficiency encoding (hereinafter called compression) to reduce the amount of data, thereby reducing transmission or storage costs.
Encoded signals are subjected to decoding for reproduction; when the signals are in compressed form, expansion must also be performed. Generally, expansion often requires complex calculations, and in the case of moving images and the like which are to be processed in real time, high processing speed is needed.
In many cases, however, sufficient processing speed cannot be obtained, for example, when the processing is performed by computer software. In such cases, appropriate decimation is applied to the input moving image signal to reduce the amount of data and to permit real-time processing.
One frequently used decimation technique is frame decimation. This technique decimates an input signal on a frame-by-frame basis as it is input.
A description will be given below of one example of a prior art video and voice signal processing apparatus in which an input moving image signal is first subjected to frame decimation and then to expansion and decoding, thus outputting a video signal and a voice signal. FIG. 7 is a block diagram showing the configuration of the prior art video and voice signal processing apparatus. In FIG. 7, 701 is a signal receiving circuit, 702 is a signal extraction circuit, 703 is a video signal processing circuit, 704 is a voice signal processing circuit, 705 is an input signal, 706 is an encoded video signal, 707 is an encoded voice signal, 708 is a video signal, and 709 is a voice signal.
FIG. 8 is an explanatory diagram showing individual frames of the input signal 705 arranged along the time axis. In FIG. 8, 801a and 801d are frames to be decoded, and 802b, 802c, 802e, and 802f are frames to be discarded by frame decimation.
The operation of the thus configured video and voice signal processing apparatus will be described. As an example, it is assumed here that two frames out of every three frames are discarded by frame decimation. In FIG. 7, the signal receiving circuit 701 receives the input signal 705, and supplies it to the signal extraction circuit 702. The signal extraction circuit 702 extracts signals corresponding to the frames 801a and 801d from the thus supplied input signal 705, separates the encoded video signal 706 and the encoded voice signal 707 from the extracted signals, and supplies them to the video signal processing circuit 703 and the voice signal processing circuit 704, respectively. At this time, signals corresponding to the frames 802b, 802c, 802e, and 802f of the input signal 705 are discarded. The video signal processing circuit 703 decodes the encoded video signal 706 thus supplied, and outputs the decoded signal as the video signal 708. Likewise, the voice signal processing circuit 704 decodes the encoded voice signal 707 thus supplied, and outputs the decoded signal as the voice signal 709.
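The operation described above can be sketched in software as follows. This is only an illustrative sketch under stated assumptions, not the circuitry of FIG. 7: the `Frame` type, the functions `extract_frames` and `separate`, and the placeholder payload strings are hypothetical names introduced here. The frame labels follow FIG. 8, and the ratio of one kept frame in every three follows the example assumed in the description.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Frame:
    """Hypothetical stand-in for one frame of the input signal 705."""
    label: str   # reference numeral from FIG. 8, e.g. "801a"
    video: str   # placeholder for the encoded video portion of the frame
    voice: str   # placeholder for the encoded voice portion of the frame


def extract_frames(frames: List[Frame], period: int = 3) -> List[Frame]:
    """Keep the first frame of every `period` frames and discard the rest."""
    return [frame for index, frame in enumerate(frames) if index % period == 0]


def separate(frames: List[Frame]) -> Tuple[List[str], List[str]]:
    """Split the kept frames into an encoded video stream and a voice stream."""
    video = [frame.video for frame in frames]
    voice = [frame.voice for frame in frames]
    return video, voice


# Six consecutive frames, labeled as in FIG. 8.
labels = ["801a", "802b", "802c", "801d", "802e", "802f"]
input_signal = [Frame(label, f"video-{label}", f"voice-{label}") for label in labels]

kept = extract_frames(input_signal)   # frames 801a and 801d
encoded_video, encoded_voice = separate(kept)
```

In this sketch, as in the apparatus described, the voice portion of each discarded frame is dropped together with its video portion, so `encoded_voice` contains only the voice data of frames 801a and 801d.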
However, in the above-described prior art configuration, not only the video signal but also the voice signal is decimated as a result of the frame decimation. Generally, when frame decimation is applied to a video signal, the video can still be recognized as a moving image, though the motion becomes jerky; when frame decimation is applied to a voice signal, on the other hand, signal continuity is lost, so that the voice can no longer be recognized as voice.
The present invention has been devised in view of the frame decimation problem of the prior art, and an object of the invention is to provide a video and voice signal processing apparatus capable of outputting a voice signal recognizable as voice even when frame decimation is applied.
Meanwhile, with the advance of digital signal technologies, sound is increasingly being recorded and reproduced using digital signals.
Nowadays, compact discs and minidiscs, on which sound is recorded using digital sound signals, are the predominant sound recording media. In television broadcasting also, systems for transmitting both video and sound using digital signals are beginning to be employed, as seen in digital satellite broadcasting. Further, personal computers are increasingly used for sound processing such as recording or reproduction using digital sound signals; with improvements in the performance of personal computers, it is now not uncommon for personal computers to reproduce video and sound simultaneously by using digital signals.
As shown in FIG. 9, a prior art sound signal processing apparatus 1 is an apparatus that accepts at its input a reproduced sound signal A of a digital sound signal transmitted from a sound signal transmitting apparatus 2, and that converts it into an analog sound signal and outputs the analog sound signal as an output sound signal B.
In recent years, advances have been made in sound quality and sound multiplexing in stereo broadcasting or the like, and the reproduced sound signal A from the sound signal transmitting apparatus 2 may contain a large amount of information.
However, the above prior art sound signal processing apparatus 1 converts the reproduced sound signal A into an analog sound signal in the order in which it is input. Consequently, if the reproduced sound signal A contains a large amount of information, problems such as interruptions in the sound corresponding to the reproduced sound signal A will occur unless the sound signal processing apparatus 1 converts the reproduced sound signal A into the analog output sound signal B fast enough.
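The speed condition stated above can be illustrated with a simple calculation. The following is a rough sketch only: the function `backlog_after` and the rates used are hypothetical, the amounts are in arbitrary units, and the model assumes that a fixed amount of data arrives and a fixed amount can be converted in each second. It shows that unconverted data accumulates whenever the input rate exceeds the conversion rate, which is the situation in which interruptions in the sound would be expected.

```python
def backlog_after(seconds: int, input_rate: int, conversion_rate: int) -> int:
    """Return the amount of unconverted data left after `seconds`.

    `input_rate` is the amount of data arriving per second and
    `conversion_rate` is the most that can be converted per second
    (both hypothetical, in arbitrary units).
    """
    backlog = 0
    for _ in range(seconds):
        backlog += input_rate                     # data arrives in input order
        backlog -= min(backlog, conversion_rate)  # convert as much as possible
    return backlog


# Conversion keeps pace with the input: nothing accumulates.
on_time = backlog_after(seconds=10, input_rate=100, conversion_rate=100)

# Input carries more data than can be converted: the backlog grows each second.
falling_behind = backlog_after(seconds=10, input_rate=150, conversion_rate=100)
```

With these assumed rates, `on_time` is 0, while `falling_behind` grows by 50 units per second and reaches 500 after ten seconds.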