1. Field of the Invention
The present invention relates to a system, method and apparatus that allow encoding using a motion vector to be performed efficiently when AV apparatuses are used in combination to encode a video signal for network transmission.
2. Description of the Related Art
In general, live video obtained from, for example, video phone, video conference and remote security monitoring is transmitted over the Internet and other networks, and recorded video, for example, video content, is likewise transmitted for distribution. Such transmission of video via networks used to be performed using dedicated terminals, but lately AV apparatuses such as a video camera, VTR, PC (Personal Computer), network-supported television receiver, STB (Set-Top Box) and telephone unit have been used in combination to transmit such video.
In order to transmit video signals via a network, the video signals are normally encoded and compressed to suit the bandwidth of the network before being transmitted. For example, when a system that encodes a video signal for video phone is implemented as a single dedicated terminal incorporating an imaging unit such as a CCD camera and an encoding circuit, the dedicated terminal can be designed so that the volume of video signals supplied from the imaging unit is balanced against the encoding processing performed by the encoding circuit before transmission.
On the other hand, a video signal coding system for video phone may be obtained by using a video camera or the like as the imaging unit and a PC or network-supported television receiver as the encoding circuit. In such a case, the volume of video signals and the encoding processing are not necessarily optimized, since these apparatuses were originally made for different purposes.
FIGS. 1 and 2 are block diagrams each showing a related-art example in which such a video signal coding system for network transmission is configured using a combination of AV apparatuses. It should be noted that an apparatus on the side of encoding video signals, such as a PC, network-supported television receiver or STB, is herein termed a “signal conversion apparatus”, and an apparatus on the side of supplying the video signals to the “signal conversion apparatus”, such as a video camera or VTR, is herein termed a “video output apparatus”.
FIG. 1 shows an example in which a decoding unit 101 in a video output apparatus 100 decodes a video signal encoded using, for example, MPEG-2 into an uncompressed video signal such as a composite signal. The uncompressed video signal is then output from the video output apparatus 100 and input into a signal conversion apparatus 102.
The input video signal is first supplied to a size/number of frames converting unit 103 in the signal conversion apparatus 102. The size/number of frames converting unit 103 reduces the screen size and the number of frames of the video signals based on size/frame information (information specifying the screen size and the number of frames corresponding to a display apparatus at the other end of the video phone connection), thereby reducing the volume of video signals to a level suitable for network transmission. The video signals whose volume has thus been reduced are encoded by an encoding unit 104 using a coding method for video phone (for example, H.261 or MPEG-4 Part 10 (AVC)), and transmitted to the network from a network interface (not illustrated).
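The reduction performed by the size/number of frames converting unit 103 can be illustrated with a minimal Python sketch. It assumes the simplest possible scheme, keeping every n-th frame and every n-th pixel in each direction; the function name `convert_size_and_frames` and its parameters are hypothetical, and a practical converter would normally filter the signal before discarding samples.

```python
def convert_size_and_frames(frames, size_step, frame_step):
    """Keep every frame_step-th frame and every size_step-th pixel
    (in both directions) of each kept frame.

    Illustrative assumption: plain decimation with no pre-filtering.
    """
    reduced = []
    for frame in frames[::frame_step]:
        reduced.append([row[::size_step] for row in frame[::size_step]])
    return reduced


# Four 4x4 frames whose pixel values encode (frame index, row, column).
frames = [[[100 * t + 10 * y + x for x in range(4)] for y in range(4)]
          for t in range(4)]

# Halve the width, the height and the number of frames.
small = convert_size_and_frames(frames, size_step=2, frame_step=2)
```

With these settings the volume of pixel data falls to one eighth of the input: half the frames, each with a quarter of the pixels.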
FIG. 2 shows an example in which a video signal encoded using MPEG-2 or the like is output from a video output apparatus 200 and input into a signal conversion apparatus 201.
The encoded video signal is first decoded by a decoding unit 202 and afterward supplied to a size/number of frames converting unit 203 in the signal conversion apparatus 201. The size/number of frames converting unit 203 reduces the screen size and the number of frames of the video signals based on size/frame information, thereby reducing the volume of video signals to a level suitable for network transmission. The video signals whose volume has thus been reduced are encoded by an encoding unit 204 using a coding method for video phone, and transmitted to the network from a network interface (not illustrated). (A motion vector converting unit 205 is described later.)
Among video transmissions through networks, video phone, video conference, and remote security monitoring in particular may require highly efficient encoding for real-time video transmission. A method using a motion vector is generally used as a highly efficient coding method. However, detecting the motion vector at the time of encoding may require a huge amount of calculation, amounting to more than half of the whole encoding processing.
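To give a sense of why motion vector detection is costly, the following Python sketch performs an exhaustive block-matching search for a single block, using the sum of absolute differences (SAD) as the matching cost. Exhaustive search with SAD is an assumption chosen for illustration, not a method stated in this description, and the names `sad` and `full_search` are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Try every displacement within +/- search_range and return
    (best_vector, number_of_candidates_tested) for one block."""
    height, width = len(ref), len(ref[0])
    best, best_cost, tested = (0, 0), None, 0
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip candidates that fall outside the reference frame.
            if bx + dx < 0 or bx + dx + n > width:
                continue
            if by + dy < 0 or by + dy + n > height:
                continue
            tested += 1
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, tested


# 12x12 reference frame with a non-repeating pattern.
ref = [[x * x + 10 * y * y for x in range(12)] for y in range(12)]
# Current frame: the same picture shifted by 2 pixels in x and 1 in y.
cur = [[ref[(y + 1) % 12][(x + 2) % 12] for x in range(12)] for y in range(12)]

vector, tested = full_search(cur, ref, bx=4, by=4, n=4, search_range=2)
```

Even in this toy case, one 4x4 block with a search range of 2 requires 25 candidate positions of 16 pixel differences each. The number of candidates grows as (2R + 1)^2 with the search range R, and the search is repeated for every block of every frame.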
Therefore, in the case where the uncompressed video signal is input into the signal conversion apparatus 102 as in the configuration shown in FIG. 1, a large-scale circuit that performs a great amount of calculation for detecting the motion vector may need to be provided in the encoding unit 104. As a result, not only does the cost of such a large-scale circuit raise the product price, but the large amount of power consumed by this circuit may also be inconvenient for a consumer.
Japanese Unexamined Patent Application Publication No. 2001-238218 (paragraphs 0024 to 0026, and FIG. 1), on the other hand, discloses the following technology for the case where the encoded video signal is input into the signal conversion apparatus as in the configuration shown in FIG. 2. In this proposed technology, the motion vector converting unit 205 is provided to convert a motion vector output from the decoding unit 202 into a motion vector corresponding to the coding method implemented in the encoding unit 204 (and corresponding to the reduced screen size and number of frames specified by the size/frame information). Accordingly, the amount of calculation necessary for the encoding is reduced in comparison with the case where the motion vector is detected from scratch.
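One simple way to picture such a conversion is linear scaling of the decoded motion vector by the size reduction and the frame reduction, as in the Python sketch below. This is an illustrative assumption only: the conversion actually used in the cited publication is not reproduced here, and the function `convert_motion_vector` and its parameters are hypothetical.

```python
from fractions import Fraction


def convert_motion_vector(mv, size_ratio, frame_ratio):
    """Scale a decoded motion vector (dx, dy) for the re-encoded stream.

    size_ratio:  output width / input width, e.g. Fraction(1, 2).
    frame_ratio: input frame rate / output frame rate, e.g. 2 when every
                 second frame is dropped, so that one output frame interval
                 spans two input frame intervals.

    Assumptions: the same ratio applies horizontally and vertically, and
    motion is roughly constant across the dropped frames.
    """
    scale = Fraction(size_ratio) * Fraction(frame_ratio)
    return tuple(round(component * scale) for component in mv)


# Quarter-size picture, every second frame dropped: overall scale 1/2.
converted = convert_motion_vector((12, -6), Fraction(1, 4), 2)
```

Here the vector (12, -6) becomes (6, -3). Because the conversion is a few multiplications per vector rather than a search over many candidate positions, it is far cheaper than detecting the motion vector again.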
However, in the configuration shown in FIG. 2, the signal conversion apparatus 201 may need to include the decoding unit 202 corresponding to the coding method of the video signal (for example, MPEG) implemented in the video output apparatus 200. Therefore, AV apparatuses that can be used as the signal conversion apparatus 201 are limited to those supporting the coding method implemented in the AV apparatus used as the video output apparatus 200. Accordingly, other AV apparatuses cannot be used to realize the configuration shown in FIG. 2.
Further, in the case where the video output apparatus 200 is, for example, a camcorder including a high-resolution CCD camera having a large number of pixels, and a video signal output from the CCD camera is encoded with a high compression ratio and output from the video output apparatus 200, a large processing capacity may also be required for the decoding unit 202 included in the signal conversion apparatus 201 to decode the highly compressed video signal. In that case as well, the product price may be raised, and the large amount of power consumed by such processing may be inconvenient for a consumer.