Field of the Invention
The present invention relates to a transmission device configured to transmit stream data, a receiving device configured to receive stream data, a method of transmitting stream data, and a method of receiving stream data.
Description of the Related Art
In recent years, network transmission of moving image data has come into use around the world. With the increasing number of pixels in videos and the increasing variety of ways in which users consume video, there is a need for real-time transmission of one piece of video data at a resolution and quality that meets the demands of each user, for example, by using Scalable Video Coding (SVC). Note that SVC, which is a technology for hierarchical encoding of moving image data, including temporal hierarchical encoding with respect to frame rate, has been proposed as an extension of H.264/AVC.
Further, Real-time Transport Protocol (RTP) is a leading standard technology for streaming moving image data in real time via the Internet or a local area network (LAN). RTP is regarded as a protocol of the session layer in the so-called Open Systems Interconnection (OSI) reference model, and it is common for RTP to be combined with User Datagram Protocol (UDP) at the transport layer. The OSI reference model is a model developed by the International Organization for Standardization (ISO) that defines communication protocols divided into seven layers. Note that UDP is an effective technology for real-time streaming of moving image data. However, UDP itself does not recover lost packets, and when the moving image data is corrupted on the receiving side due to packet loss during transmission, the quality of the obtained video image deteriorates.
Accordingly, Japanese Patent Application Laid-Open No. 2010-141413 discloses a technology for using Forward Error Correction (FEC) to recover a lost packet in communication using RTP, which is a higher-layer protocol than UDP. FEC is a technology for transmitting, together with the packets of moving image data, data packets for restoring moving image data packets for which a transmission error has occurred. Note that, in the following description, the packets of moving image data are referred to as “media packets”, and the data packets for restoring those media packets are referred to as “FEC packets”. The FEC packets are generated by grouping one or more media packets and performing an FEC generation operation on those media packets. Examples of FEC generation operations that are mainly employed include an exclusive OR (XOR) operation and a Reed-Solomon operation. A method of designating the groups of media packets when generating the FEC packets is defined in “RTP Payload Format for Generic Forward Error Correction” of Request For Comments (RFC) 5109.
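As an illustration of the XOR-based FEC generation operation mentioned above, the following is a minimal sketch in Python. It is not the packet format of RFC 5109 and is not taken from the cited publication; the function names (`xor_bytes`, `make_fec_packet`, `recover_lost_packet`) are hypothetical, and the length of the lost packet is assumed to be known to the receiver.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, zero-padding the shorter one."""
    n = max(len(a), len(b))
    a = a.ljust(n, b"\x00")
    b = b.ljust(n, b"\x00")
    return bytes(x ^ y for x, y in zip(a, b))


def make_fec_packet(media_packets):
    """Generate one FEC payload as the XOR of all media packets in a group."""
    return reduce(xor_bytes, media_packets, b"")


def recover_lost_packet(received_packets, fec_packet, lost_length):
    """Restore the single missing media packet of a group.

    Assumption: exactly one packet of the group was lost and its
    original length (lost_length) is known to the receiver.
    """
    recovered = reduce(xor_bytes, received_packets, fec_packet)
    return recovered[:lost_length]


# Hypothetical group of three media packets.
group = [b"frame-0", b"frame-1-longer", b"f2"]
fec = make_fec_packet(group)

# Suppose the second packet is lost in transit.
restored = recover_lost_packet([group[0], group[2]], fec, len(group[1]))
assert restored == group[1]
```

With a single XOR FEC packet per group, only one lost media packet per group can be restored in this sketch; losing two or more packets of the same group leaves the group unrecoverable by FEC alone.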
Further, as a countermeasure for a case in which more media packets have been lost than FEC can tolerate, and restoration by FEC therefore cannot be performed, there is a method called hybrid automatic repeat request (ARQ) in which FEC and retransmission control are used together. Japanese Patent Application Laid-Open No. 2010-141413 discloses the combined use of FEC packet transmission and retransmission control. Note that retransmission control is a technology in which, when packet loss has occurred, the receiving side issues a request to the transmission side for the lost packet, and the transmission side retransmits the packet to the receiving side. In hybrid ARQ, when the receiving side determines that a lost packet cannot be restored even by FEC, the receiving side issues a request to the transmission source for the packet.
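The receiving-side decision in hybrid ARQ described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure of the cited publication: the function name `packets_to_request` and the parameter `max_recoverable` (the number of lost media packets that the received FEC packets can restore within one group) are hypothetical.

```python
def packets_to_request(group_seqs, received_seqs, max_recoverable):
    """Return the sequence numbers to request by retransmission.

    group_seqs:      sequence numbers of the media packets in one FEC group
    received_seqs:   sequence numbers that actually arrived
    max_recoverable: assumed number of losses FEC can restore in this group
    """
    received = set(received_seqs)
    lost = [seq for seq in group_seqs if seq not in received]
    if len(lost) <= max_recoverable:
        # FEC alone is sufficient, so no retransmission request is issued.
        return []
    # FEC cannot restore the group, so the lost packets are requested.
    return lost


# One packet lost, one recoverable by FEC: nothing is requested.
assert packets_to_request([1, 2, 3, 4], [1, 2, 4], max_recoverable=1) == []

# Two packets lost, only one recoverable: both are requested.
assert packets_to_request([1, 2, 3, 4], [1, 4], max_recoverable=1) == [2, 3]
```

In this sketch the decision can only be made once the receiver knows which packets of the group are missing, which is why the time span covered by a group affects how soon a retransmission request can be issued.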
In real-time transmission of moving image data, the media packets need to be correctly transmitted, the transmission amount should not be excessive, and the processing from transmission of the media packets until decoding of the video data and the audio data needs to be executed in as short a time as possible. In order to satisfy those demands, conventionally, an appropriate number of FEC packets is generated and transmitted for each of the video stream, the audio stream, and each media stream accompanying the video and audio streams.
In addition, in the case of SVC, video data is transmitted as a data stream of a base layer and as data streams of one or more enhancement layers. For example, the data of the base layer and the enhancement layer(s) is transmitted based on a video resolution demanded by a user. In this case, in consideration of the fact that data in different resolutions is simultaneously being transmitted to a plurality of users, the base layer and the enhancement layer(s) are each transmitted as separate streams. Thus, in SVC, the data for one moving image is transmitted in a plurality of data streams, but the data amount differs greatly from stream to stream. For example, in the case of SVC moving image data in which spatial scalability is employed, the data amount of each layer becomes larger as the layer becomes higher, from the base layer up through the enhancement layers. Note that the data amount for audio data is generally smaller than the data amount for video data.
However, even if there is a large difference in data amounts between streams, the ratio of the number of FEC packets to the number of media packets cannot be changed greatly from stream to stream, in order to prevent a large increase in the overall transmission amount. Therefore, for a stream having a small data amount, the media packet groups for FEC generation span a longer period in the time axis direction than for a stream having a large data amount. This can cause delays in FEC generation, as well as delays in issuing retransmission requests under hybrid ARQ when FEC-based error restoration has failed, and as a result can degrade the ability to transmit in real time. Further, for a lower layer in SVC, because the data amount is small, the groups for FEC generation tend to be formed across a plurality of frames. In this case, error recovery cannot be performed until the packets of the next frame have arrived, in addition to the packets of the frame in which the error occurred. This is also a factor in harming the ability to transmit in real time. In addition, in SVC, decoding of a higher layer cannot be completed unless decoding of the lower layer thereof is complete. Therefore, for example, failure or delay in the decoding of the lower layer affects the quality of the SVC video as a whole. Thus, in the case of a method such as SVC in which the data is divided into a plurality of layers and stream data is transmitted for each layer, the ability to transmit in real time can be harmed by delays in error recovery.
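The widening of FEC groups along the time axis for low-rate streams can be illustrated with a short calculation. The following Python sketch uses hypothetical figures that do not appear in the source (a payload of 1000 bytes per media packet, 10 media packets per FEC group, a frame rate of 30 frames per second, and bit rates of 4 Mbit/s and 400 kbit/s), and the function names are likewise assumptions made for illustration.

```python
def fec_group_span_seconds(bitrate_bps, payload_bytes, packets_per_group):
    """Time needed for a stream to produce one full FEC group."""
    packets_per_second = bitrate_bps / (payload_bytes * 8)
    return packets_per_group / packets_per_second


def frames_spanned(span_seconds, frames_per_second):
    """Number of frame intervals covered by one FEC group."""
    return span_seconds * frames_per_second


# Hypothetical parameters shared by both streams.
PAYLOAD_BYTES = 1000
PACKETS_PER_GROUP = 10
FPS = 30

# Hypothetical high-rate enhancement layer and low-rate base layer.
high = fec_group_span_seconds(4_000_000, PAYLOAD_BYTES, PACKETS_PER_GROUP)
low = fec_group_span_seconds(400_000, PAYLOAD_BYTES, PACKETS_PER_GROUP)

print(f"high-rate stream: {high * 1000:.0f} ms per group, "
      f"{frames_spanned(high, FPS):.1f} frame intervals")
print(f"low-rate stream:  {low * 1000:.0f} ms per group, "
      f"{frames_spanned(low, FPS):.1f} frame intervals")
```

Under these assumed figures, one group of the high-rate stream fills in about 20 ms, which is less than one frame interval, whereas one group of the low-rate stream takes about 200 ms and covers roughly six frame intervals. With the same ratio of FEC packets to media packets, the low-rate stream must therefore wait for packets of several later frames before FEC generation or error recovery for the group can complete.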