Media applications and systems, for example video editing or streaming of video over the Internet, often require that only a particular time range of the media in the file be used. For example, a user of a video editing program may wish to edit only a particular time range of a video file, or a user may wish to view only the middle portion of a video file. In order to allow this to be done, it is common for media file formats to divide a file into segments corresponding to particular time ranges, and for an index to be provided that declares where the segments can be found in the file.
An example of such a media file format is MPEG-4. The structure of an MPEG-4 file is shown in FIG. 1. The file 10 comprises an index 11, and a plurality of GOPs (“groups of pictures”) 12a, 12b, 12c to 12d. A GOP is a series of images making up a particular sequence of video. The images are compressed, and as can be seen from FIG. 1 this results in the GOPs being of different lengths (i.e. being made up of different numbers of bytes). One reason for this is that the video a GOP represents will compress to a different size depending on the nature of the images making up the video; for example, as compression techniques include identifying the differences between images in a series, a series of very similar images will generally be compressed to a much smaller size than a series of images that differ substantially from each other. The location of a GOP in a file will therefore depend on the size of each preceding GOP. The index 11 provides a mapping from time ranges of video to byte ranges in the file 10, thus allowing the GOP (or GOPs) corresponding to a particular time range of video to be found.
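By way of illustration only, the dependence of each GOP's location on the sizes of all preceding GOPs may be sketched as follows. The function names, GOP sizes and GOP durations are hypothetical, and the tuple layout is a simplification rather than the actual MPEG-4 index format.

```python
def build_index(index_size, gop_sizes, gop_durations):
    """Return (start_time, end_time, start_byte, end_byte) for each GOP.

    Hypothetical layout: the index occupies the first `index_size` bytes
    and the GOPs follow back to back, as in FIG. 1.
    """
    entries = []
    byte_pos = index_size
    time_pos = 0.0
    for size, duration in zip(gop_sizes, gop_durations):
        entries.append((time_pos, time_pos + duration, byte_pos, byte_pos + size))
        # The next GOP starts where this one ends, so its offset depends
        # on the compressed size of every GOP before it.
        byte_pos += size
        time_pos += duration
    return entries


def byte_range_for(entries, t_start, t_end):
    """Byte range covering every GOP that overlaps [t_start, t_end)."""
    hits = [e for e in entries if e[0] < t_end and e[1] > t_start]
    if not hits:
        raise ValueError("time range lies outside the file")
    return hits[0][2], hits[-1][3]


# Four GOPs of equal duration but very different compressed sizes.
example_index = build_index(100, [5000, 1200, 8000, 3000], [0.5, 0.5, 0.5, 0.5])
example_range = byte_range_for(example_index, 0.6, 1.2)
```

In this sketch the byte range for the period 0.6 to 1.2 seconds cannot be computed until the size of the first GOP is known, even though that GOP is not itself required.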
Another example of such a media file format is fragmented MPEG-4, the structure of which is shown in FIG. 2. A file 20 comprises a header 21, a plurality of “moof”s (movie fragments) 22a, 22b, 22c and so on, and a footer 23. Each moof provides a portion of video of a fixed duration, for example two seconds of video. FIG. 2 further shows the internal structure of a moof. A moof comprises a header file 25 and a plurality of GOPs 26a, 26b to 26c. Thus, each moof is much the same as a single MPEG-4 file.
The header 21 contains, in XML format, details of the moofs in the file 20 and the time ranges for the video they contain; for example, that moofs 22a, 22b and 22c provide video in the time ranges 0-2 seconds, 2-4 seconds and 4-6 seconds respectively.
The footer 23 contains in XML format details of the byte ranges for the moofs in the file 20. As can be seen in FIG. 2, even though each moof contains a fixed duration of video, the moofs themselves are of variable length. Thus, the byte range for a moof in the file 20 cannot be determined merely from its time range, and the footer 23 is required in order to find a particular moof within the file 20.
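The respective roles of the header 21 and the footer 23 may be sketched as follows, purely as an illustration. The two-second moof duration is taken from the example above; the function names, the moof sizes and the use of plain Python lists in place of the XML data are assumptions made for the sketch.

```python
import math

MOOF_DURATION = 2.0  # seconds of video per moof, as in the example above


def moofs_for_time_range(t_start, t_end, moof_duration=MOOF_DURATION):
    """Indices of the moofs covering [t_start, t_end).

    Because every moof has the same duration, this step needs no
    knowledge of how well the video compressed.
    """
    if t_end <= t_start:
        raise ValueError("empty time range")
    first = int(t_start // moof_duration)
    last = math.ceil(t_end / moof_duration) - 1
    return list(range(first, last + 1))


def build_footer(header_size, moof_sizes):
    """(start_byte, end_byte) of each moof; hypothetical stand-in for footer 23.

    Every moof size must be known before any entry after the first
    can be written.
    """
    footer = []
    pos = header_size
    for size in moof_sizes:
        footer.append((pos, pos + size))
        pos += size
    return footer


example_footer = build_footer(500, [40000, 25000, 61000])
example_moofs = moofs_for_time_range(3.0, 5.0)
example_bytes = [example_footer[i] for i in example_moofs]
```

Here the time range 3 to 5 seconds maps to the second and third moofs from the fixed duration alone, but their byte ranges are available only once the footer has been built from the sizes of all the moofs.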
The use of such media files in a known system is now described with reference to FIGS. 3 and 4. FIG. 3 is a schematic diagram of a known networked computer system. A file system 31 comprises a data store 32, a file record database 33, and a file system gateway 34. The file system gateway 34 is in communication with an Internet Information Services (IIS) web server 35 (as developed by Microsoft). The IIS web server 35 communicates via the Internet 36 with a personal computer 37 running a video streaming client application 38, in this case a Silverlight application. The IIS web server 35 streams video to the client application 38 using the Smooth Streaming media service.
As is well known, the Smooth Streaming media service provides video in the form of fragmented MPEG-4 files at a quality level appropriate to the bandwidth over which the video is streamed. In essence, video is requested by the client application 38 at the highest quality that its available bandwidth can support. (Higher quality video will be larger in size, and so will require greater bandwidth.) The client application 38 receives video from the IIS web server 35, which it stores in a buffer. When the buffer contains a sufficient duration of video (i.e. a number of seconds of video), the client application 38 begins to display the video. If the client application 38 finds that the duration of video in the buffer has increased beyond a certain point, this indicates that additional bandwidth is available, and so the client application 38 increases the quality of the video it requests. Conversely, if the duration of video in the buffer falls below a certain point, this indicates that insufficient bandwidth is available, and so the client application 38 lowers the quality of the video it requests. In order to provide the differing qualities of video to the client application 38, the IIS web server 35 requires that fragmented MPEG-4 files are available that provide versions of the video being streamed in all the quality levels that may be required, so that it can provide them as and when requested by the client application 38.
FIG. 4 shows a typical use of files of differing qualities by an IIS web server 35 using Smooth Streaming. The IIS web server 35 has files 41 to 45 of qualities 150 Kb/s, 300 Kb/s, 600 Kb/s, 900 Kb/s and 2000 Kb/s respectively. Initially, the client application 38 requests video of quality 150 Kb/s from file 41. When a certain duration of video has been obtained, display of the video will begin. The buffer then continues to receive video from the file 41. As the bandwidth required by this low-quality video is small, the duration of video in the buffer quickly increases (as video is being received at a faster rate than it is being displayed), and once it exceeds a certain point the client application 38 requests the next-highest quality video, of quality 300 Kb/s, from file 42. This continues as the buffer continues to fill, with the client application 38 requesting video at higher and higher qualities at appropriate points. The duration of the video in the buffer may also fall below a certain point, for example due to a restriction in bandwidth, or because a change in the content of the video causes the same duration of video of the same quality to be larger in size (this is explained in more detail below). In this case, the client application 38 requests a lower quality of video. An example of this can be seen in FIG. 4, where the client application 38 is initially requesting the video of quality 2000 Kb/s from file 45; when the duration of video in the buffer falls below a certain point, it changes to requesting the video of lower quality 900 Kb/s from file 44; and once the duration of video in the buffer has increased again, it returns to requesting the video of quality 2000 Kb/s from file 45.
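The buffer-driven selection of quality described above may be sketched as follows. This is a simplified illustration only: the quality levels are those of files 41 to 45, but the threshold values, the function name and the sequence of buffer readings are hypothetical, and the sketch does not represent the actual heuristics used by Smooth Streaming.

```python
QUALITIES_KBPS = [150, 300, 600, 900, 2000]  # files 41 to 45 in FIG. 4


def next_quality_index(current, buffered_seconds, low=4.0, high=10.0):
    """Choose the quality level for the next request.

    `low` and `high` are hypothetical thresholds, in seconds of
    buffered video; the source does not give actual values.
    """
    if buffered_seconds > high and current < len(QUALITIES_KBPS) - 1:
        return current + 1  # buffer is filling: step up one quality level
    if buffered_seconds < low and current > 0:
        return current - 1  # buffer is draining: step down one quality level
    return current


# Hypothetical buffer readings taken before each request.
level = 0
requested = []
for buffered in [12.0, 12.0, 12.0, 12.0, 2.0, 12.0]:
    level = next_quality_index(level, buffered)
    requested.append(QUALITIES_KBPS[level])
```

With these assumed readings, the requested qualities rise step by step from 300 Kb/s to 2000 Kb/s, drop to 900 Kb/s when the buffer runs low, and then return to 2000 Kb/s, mirroring the pattern shown in FIG. 4.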
It can be seen that for any particular time segment of video, the client application 38 will request that time segment from only one of the files, with the choice of file being determined by the duration of video currently stored in the buffer. In practice, the smallest segment of a file that the IIS web server 35 sends to the client application 38 is a single moof. Thus, for any particular file it is likely that only a small number of the moofs it contains will be required. However, the IIS web server 35 requires the footer 23 in order to be able to locate any particular moof in a file. As the footer contains the locations of all the moofs in the file, this requires the locations of all the moofs to be specified before any moofs can be obtained. The location of a moof will depend on the size of each preceding moof, which will depend on how well the video in each moof is compressed. Thus, before any moof can be obtained from any particular file, the size of every moof in the file needs to be known.
In practice, the files of each required quality are generated in advance, so that particular time segments from the files can be provided by the file system 31 to the IIS web server 35 when required. In the common scenario of a web site serving a single video to multiple users (over time and/or at the same time), the overhead of creating the files in advance is not great. This is because there are few files to serve to many recipients. However, in scenarios in which there are many files that may be served to a small number of recipients, the overhead of generating the required files in advance can become extremely significant, and may be overly onerous or even impracticable. An example of this would be where the file system 31 is an archive containing many media files. In this case, files of each required quality would need to be generated for every single file in the archive, even though any particular file may not be viewed at all. (A partial solution to this would be to create the files of differing quality only when a particular file is selected to be viewed, but this would cause a large delay before viewing of a selected file could begin.) Further, there may simply not be sufficient space to store all the required files.
Another situation in which similar disadvantages arise is the editing of video using editing software. A video file in a particular format may be stored in a file system. However, the editing software may use a different file format to edit files; for example the file format used by the editing software may be MPEG-4. When editing a file in MPEG-4 format, the editing software will initially request the index, so that it can locate particular GOPs within the file. This requires the location of each GOP to be declared, which requires the length of each GOP to be known, which in turn depends on how the images in each GOP are compressed. Therefore, even if only a small portion of the video is to be edited, in order to locate the corresponding GOPs within the MPEG-4 file, the entire MPEG-4 file must be generated before editing can begin.
The present invention seeks to solve and/or mitigate the above-mentioned problems. Alternatively or additionally, the present invention seeks to provide improved methods and systems for providing file data for media files that can be used with existing media applications and systems.