In remote picture monitoring systems and picture distributing systems, demand is rapidly increasing for a motion picture transmission apparatus that uses, as a transmission line, an IP (Internet Protocol) network typified by a public line or the Internet. For example, in distribution of stream data (that is, compressed data) of a picture according to the MPEG-4 (Moving Picture Experts Group Phase 4) encoding, picture data to be transmitted is conventionally coded in accordance with the MPEG-4 encoding, and the coded picture is temporarily stored as stream data in a memory unit of a video transmission unit. The video data includes a still picture, a motion picture, CG (Computer Graphics), and animation, and also includes sound, audio, synthetic music, and the like. The video data is distributed from the memory unit of the video transmission unit in response to a request from a network.
In order to distribute such video data, particularly a motion picture, the video data has to be converted to digital data. Because the amount of information in the digitized video data is enormous, a motion picture compression technique is necessary to reduce the amount of information transmitted. For this purpose, a well-known international standard for motion picture compression, such as MPEG-2 or MPEG-4 encoding, is used.
FIG. 12 shows an example of a general network-type motion picture distribution system, specifically a schematic configuration of a motion picture surveillance system. FIG. 12 shows, for example, a case of monitoring a motion picture of a surveillance camera 120 by video monitors 124-1, 124-2, and 124-3 in three places apart from the surveillance camera 120. When it is not necessary to discriminate the video monitors 124-1, 124-2, and 124-3 from each other, they will be generically called the video monitor(s) 124. As examples of commonly used networks, a LAN (Local Area Network) 122-1, an ADSL (Asymmetric Digital Subscriber Line) 122-2, and a third generation portable telephone network 122-3 such as W-CDMA (Wideband Code Division Multiple Access) are shown. When they do not have to be discriminated from each other, they will be generically called the network(s) 122. The networks 122 are constructed of transmission lines of different transmission speeds. For example, the LAN 122-1 is a relatively high-speed network having a transmission speed of about 6 Mbps, the ADSL 122-2 has a transmission speed of 512 kbps, and the third generation portable telephone network 122-3 is a low-speed network having a transmission speed of 384 kbps.
A surveillance picture captured by the surveillance camera 120 is coded by a video transmission unit 121 such as an encoder, and the encoded picture is distributed to video reception units 123-1, 123-2, and 123-3 such as decoders via the network 122 and decoded. The decoded picture is displayed as a monitor picture on each of the video monitors 124-1, 124-2, and 124-3.
The video transmission unit 121 for compressing a motion picture transmits picture compression data (also called a stream), generated by compression at a predetermined bit rate (compression ratio) in a compression processing unit 125 of the video transmission unit 121, to the video reception units 123-1, 123-2, and 123-3. Each of the video reception units 123-1, 123-2, and 123-3 decompresses the stream to the original picture data and outputs the picture data to the monitor. In FIG. 12, a series of streams is transmitted from the video transmission unit 121 to the networks 122-1, 122-2, and 122-3. A transmission system in which the networks are connected in series is called a multicast configuration.
In the operation of the system, for example, a request for stream data is transmitted from the video reception unit 123-1 to the video transmission unit 121 via the network 122-1. The video transmission unit 121 distributes stream data to the video reception unit 123-1 which has requested the stream data.
The video reception unit 123-1 receives the stream data, decompresses the compressed stream data, displays the decompressed stream data on the monitor 124-1 and, as necessary, records it to a recording unit (not shown). After that, the video reception unit 123-1 sends a request for the next stream data to the video transmission unit 121 via the network 122-1.
The video transmission unit 121 transmits the next stream data to the video reception unit 123-1 which has requested the stream data. The video reception unit 123-1 receives the next stream data and, in a manner similar to the above, decompresses the compressed stream data, displays the decompressed stream data on the monitor 124-1 and, as necessary, records it to a recording unit.
Similarly, each of the other video reception units 123-2 and 123-3 also sequentially sends a transmission request for stream data, receives it and decompresses it.
In the case of the multicast configuration, the video transmission unit 121 has only one compression processing unit 125, so that the streams transmitted to the video reception units 123-1, 123-2, and 123-3 are the same stream, and the bit rate (compression ratio) of the streams is also the same. Therefore, unless the bit rate of the stream data output from the video transmission unit 121 is adjusted to that of the network of the lowest transmission speed among the networks 122 leading to the video reception units 123, a motion picture cannot be decompressed in all of the video reception units 123 which receive the stream.
In the case of FIG. 12, the 384 kbps of the third generation portable telephone network is a bottleneck, so that the bit rate of a stream output from the video transmission unit 121 is limited to 384 kbps or lower. Although the video reception unit 123-1 connected to the high-speed network 122-1 can inherently decompress stream data of a high bit rate of about 6 Mbps and output a high-quality picture to the video monitor 124-1, only a low-quality picture of about 384 kbps can be obtained due to the limitation. Similarly, in the video reception unit 123-2 connected to the network 122-2 of 512 kbps, only a low-quality picture of about 384 kbps is obtained as a result.
Conversely, in an arrangement in which a stream of about 4 Mbps is output from the video transmission unit 121 to suit the high-speed network 122-1, the stream cannot be transmitted over the 512 kbps ADSL or the 384 kbps third generation portable telephone network. Consequently, the stream data is not transmitted to the video reception units 123-2 and 123-3, and no motion picture is output there.
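The bottleneck described above can be illustrated with a minimal sketch. The link speeds are those given for FIG. 12; the dictionary keys and the helper names `common_bit_rate` and `reachable_links` are hypothetical and are introduced only for this illustration.

```python
# Link speeds in kbps, taken from the FIG. 12 example.
# The key names are labels chosen for this sketch only.
LINK_SPEEDS_KBPS = {
    "LAN 122-1": 6000,
    "ADSL 122-2": 512,
    "3G 122-3": 384,
}


def common_bit_rate(link_speeds):
    """Highest bit rate of a single shared stream that every link can carry."""
    return min(link_speeds.values())


def reachable_links(link_speeds, stream_kbps):
    """Links fast enough to carry a stream of the given bit rate."""
    return [name for name, speed in link_speeds.items() if speed >= stream_kbps]


if __name__ == "__main__":
    # A single stream must be limited to the slowest link.
    print(common_bit_rate(LINK_SPEEDS_KBPS))
    # A stream of about 4 Mbps reaches only the high-speed link.
    print(reachable_links(LINK_SPEEDS_KBPS, 4000))
```

With one compression processing unit, the choice is therefore between serving every receiver at the lowest link speed and serving only the receivers on sufficiently fast links.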
The picture compressing technique of the MPEG system will be described. MPEG-2 or MPEG-4 picture compression data, that is, stream data, is constructed of three kinds of data: an intra picture (hereinbelow, called an I picture), a predictive picture (hereinbelow, called a P picture), and a bidirectionally predictive picture (hereinbelow, called a B picture). The pictures are compressed in three different encoding modes. The I picture is data obtained by coding all of the video data of one frame of an analog picture within the frame. Therefore, in the case where an I picture is received, the video reception unit 123 can reproduce the picture from the one I picture alone. The P picture is data obtained by performing inter-frame prediction in one direction from the immediately preceding picture data (I picture or P picture) and encoding only the difference data. Therefore, the video reception unit 123 cannot reproduce the picture from the received P picture alone without using an I picture. Further, if a P picture is missing partway through the sequence, an erroneous picture, such as a picture in which block distortion occurs, results. The B picture is obtained by performing bidirectional inter-frame prediction from the data of two pictures, namely the previous and subsequent pictures. Like the P picture, the original picture cannot be reproduced from the B picture alone. Since redundancy in the time base direction with the preceding and subsequent pictures is reduced, the compression data amount of the P and B pictures can be reduced. However, the original picture cannot be reproduced from such a picture by itself. An example of a combination of general MPEG-2 pictures is shown as follows.
(I) (B) (B) (P) (B) (B) (P) (B) (B) (P) (B) (B) (P) (B) (B) (I) (B) (B) (P) . . . .
As described above, in a common configuration, an I picture appears once in every 15 pictures.
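The picture sequence shown above can be reproduced with a short sketch. The parameter names `n` (pictures per I-picture interval) and `m` (spacing between I or P pictures) follow a common convention and are assumptions of this illustration, as are the function names.

```python
def gop_pattern(n=15, m=3):
    """Picture types for one group of n pictures: an I picture first,
    then a P picture at every m-th position and B pictures in between."""
    types = []
    for i in range(n):
        if i == 0:
            types.append("I")
        elif i % m == 0:
            types.append("P")
        else:
            types.append("B")
    return types


def picture_sequence(count, n=15, m=3):
    """First `count` picture types of a stream built from repeated groups."""
    gop = gop_pattern(n, m)
    return [gop[i % n] for i in range(count)]


def independently_decodable(sequence):
    """Indices of pictures that can be reproduced on their own (I pictures)."""
    return [i for i, kind in enumerate(sequence) if kind == "I"]


if __name__ == "__main__":
    seq = picture_sequence(19)
    print(" ".join(f"({kind})" for kind in seq))
    print(independently_decodable(seq))
```

In this sketch only the pictures at indices 0 and 15 of the first 19 can be decoded without reference to another picture, which corresponds to one I picture in every 15 pictures.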
U.S. Pat. No. 6,157,675 discloses a video transmission apparatus for transmitting pictures of the MPEG system to networks of different transfer speeds. The video transmission apparatus changes the bit rate of coded data without changing the number of pictures to be transmitted. Specifically, on the basis of the difference between the bit rate of the coded data and a predetermined transmission rate of a transmission line, the apparatus generates a copy picture by copying a picture of coded data which was transmitted before, and transmits the copy picture instead of the data that would otherwise be transmitted, thereby reducing the data amount and transmitting the picture at the predetermined transmission rate.
According to the method, a picture transmitted before is copied. Therefore, the difference between the present picture and the immediately preceding picture is “0” and only data indicative of a copy is sent. Thus, the data amount can be largely reduced and data can be transmitted at a desired transmission rate. However, since the picture transmitted is a copy picture, the motion in a motion picture cannot be reproduced faithfully.
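The general idea of substituting copy pictures to meet a transmission rate can be sketched as follows. This is only an illustration under stated assumptions and is not the algorithm of the cited patent: the function `fit_to_rate`, the running-budget rule, and the size `COPY_PICTURE_BITS` of a copy marker are all hypothetical, and picture types (I, P, B) are ignored.

```python
# Hypothetical size, in bits, of the data that signals "copy the previous picture".
COPY_PICTURE_BITS = 100


def fit_to_rate(picture_bits, frame_rate, target_bps, copy_bits=COPY_PICTURE_BITS):
    """Replace coded pictures with copy pictures whenever sending the coded
    picture would push the running total over the running budget.

    Returns a list of (kind, bits) tuples with one entry per input picture,
    so the number of pictures is unchanged.
    """
    budget_per_picture = target_bps / frame_rate
    output = []
    sent = 0
    for index, bits in enumerate(picture_bits):
        allowed = budget_per_picture * (index + 1)
        # The first picture is always sent as coded: there is nothing to copy yet.
        if index > 0 and sent + bits > allowed:
            output.append(("copy", copy_bits))
            sent += copy_bits
        else:
            output.append(("coded", bits))
            sent += bits
    return output


if __name__ == "__main__":
    # Four pictures of 1000 bits each at 1 picture/s over a 600 bit/s line.
    for kind, bits in fit_to_rate([1000, 1000, 1000, 1000], 1, 600):
        print(kind, bits)
```

In this example the second and third pictures are replaced by copies, so the receiver displays the first picture three times in a row; this is the loss of faithful motion noted above.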