With the spread of the Internet and improvements in computer performance, distributing large-volume content such as moving images over the Internet has become widespread. For example, a service called Video On Demand (VOD) provides content such as moving images in response to a user request. As described in PTL 1, in VOD, data is transmitted and received between a server (content providing device) and a client (content playback device) using the HyperText Transfer Protocol (HTTP).
Various techniques have been developed for distributing content over HTTP. For example, the Moving Picture Experts Group (MPEG) has been advancing international standardization of an adaptive streaming technique using HTTP as the Dynamic Adaptive Streaming over HTTP (MPEG-DASH) standard.
In MPEG-DASH, the content is time-divided into a plurality of segments and is transmitted in units of segments. Each segment is constituted by one or a plurality of fragments. Furthermore, the content is constituted by one or a plurality of periods, and one period includes one or a plurality of segments.
Furthermore, in MPEG-DASH, a plurality of Representations that differ in quality type (bit rate, playback quality such as image quality, data format, and the like) are prepared for one piece of content. For example, a plurality of pieces of segment data encoded at different bit rates are prepared for each segment. Thus, a client that receives and plays back the content can perform adaptive streaming by changing the bit rate of the content (segment) that it requests in accordance with the reception status of the content and the like.
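The bit rate selection described above can be sketched as follows. This is only a minimal illustration and not a method defined by MPEG-DASH: the function name choose_bitrate, the safety factor of 0.8, and the bit rates of 512 kbps and 1024 kbps are assumptions made for the example.

```python
from typing import Sequence


def choose_bitrate(available_kbps: Sequence[int],
                   throughput_kbps: float,
                   safety: float = 0.8) -> int:
    """Pick the highest bit rate that fits within a fraction of the
    measured throughput; fall back to the lowest bit rate otherwise.

    The safety factor is a hypothetical margin, not part of MPEG-DASH.
    """
    budget = throughput_kbps * safety
    candidates = [rate for rate in available_kbps if rate <= budget]
    return max(candidates) if candidates else min(available_kbps)


# Illustrative Representations at 512 kbps and 1024 kbps.
rates = [512, 1024]
good_network = choose_bitrate(rates, throughput_kbps=2000)
poor_network = choose_bitrate(rates, throughput_kbps=1000)
```

In this sketch the client would call choose_bitrate before requesting each segment, so that the requested Representation follows the reception status.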
Furthermore, in MPEG-DASH, the content is associated with a Media Presentation Description (MPD) and is managed by the MPD. The MPD is metadata of the content and is obtained by describing management information of the content in an XML format. In other words, the MPD is information that the client uses to acquire and play back the content.
A detailed description example of the MPD will be described with reference to FIG. 10. FIG. 10 is a diagram illustrating a description example of the MPD. As illustrated in FIG. 10, information 210 including type information 211, profile information 212, buffering time information 213, and distribution start time information 214 is described in an MPD 200.
The type information 211 is information (attribute value of an attribute “type”) indicating whether the distribution is live distribution or on-demand distribution. In the illustrated example, the attribute value of the attribute “type” is “dynamic” and it indicates that the content associated with the MPD 200 is the live distribution content. On the other hand, in a case of the on-demand distribution, “static” is described as the attribute value of the attribute “type”.
The profile information 212 is information indicating a profile of the content. Furthermore, the buffering time information 213 is information indicating a minimum buffering time. In the illustrated example, an attribute value of an attribute “minBufferTime” is “PT10S”, and it indicates that the client performs the buffering of at least 10 seconds.
The distribution start time information 214 is information (attribute value of an attribute “availabilityStartTime”) indicating a time when a server starts live streaming distribution of the content. In the illustrated example, the attribute value of the attribute “availabilityStartTime” is “2012-09-20T15:00:00” and it indicates that the live streaming distribution is started at 15 o'clock, Sep. 20, 2012.
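As a rough illustration of how a client might read the attributes described above, the following sketch parses a hypothetical MPD header. The XML string is not a reproduction of FIG. 10: it is an assumed fragment that carries only the attribute values named in the text, and the "profiles" value is an assumption because the text does not state it. The helper names parse_duration and read_mpd_header are likewise hypothetical, and parse_duration handles only simple durations of the form "PT...H...M...S".

```python
import re
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta

# Hypothetical MPD header modeled on the attributes described in the text.
# The "profiles" value is an assumption; the text does not state it.
MPD_XML = (
    '<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" '
    'type="dynamic" '
    'profiles="urn:mpeg:dash:profile:isoff-live:2011" '
    'minBufferTime="PT10S" '
    'availabilityStartTime="2012-09-20T15:00:00"/>'
)


def parse_duration(text: str) -> timedelta:
    """Parse a simple duration such as "PT10S" (hours, minutes, seconds only)."""
    match = re.fullmatch(
        r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    hours, minutes, seconds = match.groups()
    return timedelta(hours=int(hours or 0),
                     minutes=int(minutes or 0),
                     seconds=float(seconds or 0))


def read_mpd_header(xml_text: str) -> dict:
    """Extract the type, buffering time, and distribution start time."""
    root = ET.fromstring(xml_text)
    return {
        "is_live": root.get("type") == "dynamic",
        "profiles": root.get("profiles"),
        "min_buffer": parse_duration(root.get("minBufferTime")),
        "start": datetime.fromisoformat(root.get("availabilityStartTime")),
    }


header = read_mpd_header(MPD_XML)
```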
Furthermore, period information 220 regarding each period that is obtained by dividing a playback duration of the content is described in the MPD 200. In the illustrated example, as the period information 220, a start time of the period (attribute value of an attribute “start”), measured with the distribution start time of the content as a reference, and a duration of the period (attribute value of an attribute “duration”) are described.
Furthermore, acquisition source information 230 indicating an acquisition source of the content is described. In the illustrated example, as the acquisition source information 230, a URL of the server is described.
Here, as the content, a high quality Representation with a bit rate of 1024 kbps and a low quality Representation with a bit rate of 512 kbps are prepared. Thus, in the MPD 200, as segments contained in a certain period (from 0 seconds to 3600 seconds of playback time), high quality segment information 241 indicating a high quality segment and low quality segment information 242 indicating a low quality segment are described. In the high quality segment information 241, an ID and a bit rate of the high quality Representation contained in the period are described. Furthermore, a length and a URL of each segment contained in the period are described. This also applies to the low quality segment information 242. Moreover, the period is constituted by six segments, segment #1 to segment #6.
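The relationship between a Representation, its segments, and a playback time can be sketched as follows. This is a minimal illustration under stated assumptions: the text gives the bit rates, the 3600-second period, and the count of six segments, but the equal segment length of 600 seconds, the relative URLs, and the names Segment, Representation, and segment_at are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Segment:
    duration_s: float  # length of the segment in seconds
    url: str           # hypothetical relative URL


@dataclass
class Representation:
    rep_id: str
    bandwidth_kbps: int
    segments: List[Segment]


def segment_at(rep: Representation, playback_s: float) -> Tuple[int, Segment]:
    """Return the 1-based segment number and segment covering playback_s."""
    elapsed = 0.0
    for number, seg in enumerate(rep.segments, start=1):
        if elapsed <= playback_s < elapsed + seg.duration_s:
            return number, seg
        elapsed += seg.duration_s
    raise ValueError("playback time is outside the period")


# Assumption: six equal 600-second segments fill the 3600-second period.
high = Representation(
    "high", 1024,
    [Segment(600.0, f"high/seg{i}.mp4") for i in range(1, 7)])
low = Representation(
    "low", 512,
    [Segment(600.0, f"low/seg{i}.mp4") for i in range(1, 7)])

number, seg = segment_at(high, 1500.0)
```

Because both Representations are divided at the same times in this sketch, the client can switch from one to the other at a segment boundary by looking up the same segment number.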
Next, a basic data structure of segment data of the related art will be described with reference to FIG. 11. FIG. 11 is a diagram illustrating the data structure of the segment data of the high quality segment and the low quality segment of the related art. Here, an example is described in which the segment data is written in the box format defined by ISOBMFF (ISO Base Media File Format, ISO/IEC 14496-12).
As illustrated in FIG. 11(a), a high quality segment 260 of the related art is constituted by one Segment Type Box (styp) 261, one Segment Index Box (sidx) 262, and one or a plurality of sets of Movie Fragment Boxes (moof) 263, 265, 267, and 269, and Media Data Boxes (mdat) 264, 266, 268, and 270. Furthermore, as illustrated in FIG. 11(b), similar to the high quality segment 260, a low quality segment 280 is constituted by one styp 281, one sidx 282, and one or a plurality of sets of moof 283, 285, 287, and 289, and mdat 284, 286, 288, and 290.
The styp 261 and the styp 281 are information indicating the type of the segment and/or version information and the like. The sidx 262 and the sidx 282 are information regarding a random access point inside the segment. The moof 263, 265, 267, and 269, the mdat 264, 266, 268, and 270, the moof 283, 285, 287, and 289, and the mdat 284, 286, 288, and 290 are information regarding a fragment constituting the segment.
One set of the moof and the mdat constitutes one fragment. Furthermore, a unit constituted by one or a plurality of fragments and obtained by dividing the segment to adapt to the random access is referred to as a subsegment. For example, in the example of FIG. 11, the moof 263 and 265, and the mdat 264 and 266 constitute one subsegment. A byte size, time information, and the like of each subsegment are described in each of entries 271, 272, 291, and 292 of the sidx (“s0” and “s1” illustrated in FIGS. 11(a) and 11(b)).
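The box sequence described above can be walked with a short parser. The sketch below relies only on the general ISOBMFF box header (a 32-bit big-endian size followed by a four-character type, with size 1 signalling a 64-bit size and size 0 meaning "to the end of the data"). The segment bytes are synthetic with dummy payloads, and the helper names make_box and iter_boxes are assumptions made for the example.

```python
import struct
from typing import Iterator, Tuple


def make_box(box_type: bytes, payload: bytes = b"") -> bytes:
    """Build a box: 32-bit size, 4-character type, then the payload."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


def iter_boxes(data: bytes) -> Iterator[Tuple[str, int, int]]:
    """Yield (type, offset, size) for each top-level box in data."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A 64-bit "largesize" field follows the type.
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # The box extends to the end of the data.
            size = len(data) - offset
        if size < header:
            raise ValueError("corrupt box header")
        yield box_type.decode("ascii"), offset, size
        offset += size


# Synthetic segment: styp, sidx, then four moof/mdat pairs (four fragments).
segment = make_box(b"styp", b"msdh") + make_box(b"sidx", b"\x00" * 24)
for _ in range(4):
    segment += make_box(b"moof", b"\x00" * 16) + make_box(b"mdat", b"\x00" * 32)

types = [box_type for box_type, _, _ in iter_boxes(segment)]
# Pair each moof with the mdat that follows it to recover the fragments.
fragments = list(zip(types[2::2], types[3::2]))
```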
Next, an example of the syntax of the sidx will be described with reference to FIG. 12. FIG. 12 is a diagram illustrating an example of the syntax of the sidx. Here, the illustrated sidx is the one defined by ISO/IEC 14496-12.
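As a hedged illustration of how such a sidx might be read, the sketch below parses the body of a sidx box (the bytes after the 8-byte box header) according to the field layout of ISO/IEC 14496-12: version and flags, reference_ID, timescale, earliest_presentation_time, first_offset, a reserved field, reference_count, and one 12-byte entry per reference. The class and function names, and all sample values (two entries standing in for "s0" and "s1"), are assumptions made for the example and are not taken from FIG. 12.

```python
import struct
from dataclasses import dataclass
from typing import List


@dataclass
class SidxReference:
    reference_type: int       # 1 bit: 0 = media, 1 = another sidx
    referenced_size: int      # 31 bits: byte size of the subsegment
    subsegment_duration: int  # in timescale units
    starts_with_sap: bool
    sap_type: int             # 3 bits
    sap_delta_time: int       # 28 bits


@dataclass
class Sidx:
    version: int
    reference_id: int
    timescale: int
    earliest_presentation_time: int
    first_offset: int
    references: List[SidxReference]


def parse_sidx(body: bytes) -> Sidx:
    """Parse the body of a sidx box (without the size/type header)."""
    version_flags, reference_id, timescale = struct.unpack_from(">III", body, 0)
    version = version_flags >> 24
    pos = 12
    if version == 0:
        ept, first_offset = struct.unpack_from(">II", body, pos)
        pos += 8
    else:
        ept, first_offset = struct.unpack_from(">QQ", body, pos)
        pos += 16
    _reserved, count = struct.unpack_from(">HH", body, pos)
    pos += 4
    refs = []
    for _ in range(count):
        word_a, duration, word_b = struct.unpack_from(">III", body, pos)
        pos += 12
        refs.append(SidxReference(
            word_a >> 31, word_a & 0x7FFFFFFF, duration,
            bool(word_b >> 31), (word_b >> 28) & 0x7, word_b & 0x0FFFFFFF))
    return Sidx(version, reference_id, timescale, ept, first_offset, refs)


# Synthetic version-0 body with two entries (illustrative values only).
sap_word = (1 << 31) | (1 << 28)  # starts_with_SAP = 1, SAP_type = 1
body = struct.pack(">IIIIIHH", 0, 1, 1000, 0, 0, 0, 2)
body += struct.pack(">III", 1000, 5000, sap_word)
body += struct.pack(">III", 1200, 5000, sap_word)

sidx = parse_sidx(body)
durations_s = [r.subsegment_duration / sidx.timescale for r in sidx.references]
```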
As described above, in MPEG-DASH, in order to support low-delay live streaming, it is conceivable to shorten the time length of each segment so as to reduce the time from when the server starts generating a segment until the segment can be distributed. Moreover, since processing delay in the server and the client, delay on the network, and the like occur, the client cannot perform playback in real time in the strict sense. Thus, live streaming that is performed in substantially real time with a slight delay is referred to as low-delay live streaming.
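The effect of the segment length on delay can be illustrated with simple arithmetic. The sketch below uses a simplified model that is an assumption of this example: a segment can be distributed only after it has been generated in full, so segment number n (counted from 1) becomes available no earlier than n segment lengths after the distribution start time, plus any processing delay. The function name and the sample values are hypothetical.

```python
from datetime import datetime, timedelta


def earliest_availability(start: datetime,
                          segment_number: int,
                          segment_duration_s: float,
                          processing_delay_s: float = 0.0) -> datetime:
    """Earliest time at which a segment can be distributed.

    Simplified model (an assumption): the segment is complete
    segment_number * segment_duration_s seconds after the start,
    and processing adds a fixed delay on top of that.
    """
    return start + timedelta(
        seconds=segment_number * segment_duration_s + processing_delay_s)


start = datetime(2012, 9, 20, 15, 0, 0)
# With 600-second segments, the first segment is ready 10 minutes later.
long_segments = earliest_availability(start, 1, 600.0)
# With 2-second segments, the first segment is ready 2 seconds later.
short_segments = earliest_availability(start, 1, 2.0)
```

Under this model, the delay between the generation start of a segment and its distribution is at least one segment length, which is why a shorter segment length reduces the delay.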