In recent years, digital media has become a commonplace carrier for delivering information to users. In particular, digital video allows users to obtain information through visual and audio means.
In its most basic form, digital video is composed of a sequence of complete image frames which are played back to the user at a rate of several frames per second. The quality of the video depends on the resolution of each frame and on the rate at which frames are displayed. Higher resolution means that more detail can be included in each frame, whilst higher frame rates improve the user's perception of movement in the video.
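As a rough illustration of how resolution and frame rate together determine the amount of raw data in such a sequence of complete frames, the following sketch computes the uncompressed size of a clip. The function name `raw_video_bytes`, the assumption of 3 bytes per pixel, and the example resolutions and frame rates are illustrative choices only and are not taken from any particular format.

```python
def raw_video_bytes(width, height, fps, seconds, bytes_per_pixel=3):
    """Size in bytes of uncompressed video.

    Assumes every frame is stored in full with a fixed number of bytes
    per pixel (3 is an illustrative assumption, e.g. 8-bit RGB).
    """
    frame_bytes = width * height * bytes_per_pixel
    return frame_bytes * fps * seconds


# One minute of video at two illustrative quality levels.
low = raw_video_bytes(640, 480, fps=25, seconds=60)
high = raw_video_bytes(1920, 1080, fps=50, seconds=60)

print(low)         # 1382400000 bytes
print(high)        # 18662400000 bytes
print(high / low)  # 13.5
```

Under these assumptions, raising both the resolution and the frame rate multiplies the raw data volume by 13.5 for the same minute of content.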
Increasing the quality of video content results in larger file sizes, which is undesirable in many applications. Encoding techniques, and in particular video compression techniques, are known which aim to reduce file sizes while minimizing any loss in quality of the video. Video compression techniques generally fall into two groups, spatial compression and temporal compression, and many common video compression formats use a combination of both.
Spatial compression involves applying compression to each individual image frame, for example in a manner similar to JPEG compression for still images.
Temporal compression exploits similarities in sequences of consecutive frames to reduce the information storage requirements. In many videos, significant parts of the scene do not change over time. In this case, the scene information from a previous frame can be re-used for rendering the next frame while only information relating to the changed pixels is stored. This can result in significant reductions in file size. Similarly, where the camera pans across a scene, a significant portion of the new frame is identical to the previous frame but offset in the direction of the pan. In this case only the newly viewable pixels would need to be encoded.
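The idea of storing only the changed pixels can be sketched as follows. This is a deliberately simplified, per-pixel illustration using a one-dimensional list of pixel values; practical video codecs generally work on blocks of pixels with motion compensation rather than on individual pixels. The function names `encode_delta` and `decode_delta` are hypothetical.

```python
def encode_delta(prev_frame, curr_frame):
    """Return (index, new_value) pairs for pixels that differ from prev_frame."""
    return [
        (i, curr)
        for i, (prev, curr) in enumerate(zip(prev_frame, curr_frame))
        if prev != curr
    ]


def decode_delta(prev_frame, delta):
    """Rebuild the current frame from the previous frame plus the stored changes."""
    frame = list(prev_frame)
    for i, value in delta:
        frame[i] = value
    return frame


# A bright two-pixel object moves one position to the right.
frame0 = [10, 10, 10, 10, 200, 200, 10, 10]
frame1 = [10, 10, 10, 10, 10, 200, 200, 10]

delta = encode_delta(frame0, frame1)
print(delta)                                  # [(4, 10), (6, 200)]
print(decode_delta(frame0, delta) == frame1)  # True
```

Only two of the eight pixel values need to be stored for the second frame, but the second frame can no longer be reconstructed without the first.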
In a video compression format such as MPEG-2, frames containing complete information are called full frames or I-frames (intra-coded frames). These frames are independent of other frames and can therefore be decoded without referring to any information in any other frame of the video. The main compression savings are made by converting the uncompressed video frames into dependent frames. These are frames which depend on some information from an adjacent frame in order to be successfully decoded. Dependent frames which depend on preceding frames are called predictive frames or P-frames, and dependent frames which depend on both preceding and following frames are called bidirectional frames or B-frames.
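The dependencies between the three frame types can be illustrated with a simplified model, sketched below. It assumes that a P-frame refers to the nearest preceding I-frame or P-frame and that a B-frame refers to the nearest I-frame or P-frame on each side; it also assumes the sequence starts with an I-frame and does not end with a B-frame. The function name `reference_frames` and the example sequence are hypothetical.

```python
def reference_frames(frame_types):
    """Map each frame index to the indices of the frames it depends on.

    Simplified model (an assumption for illustration): P-frames use the
    nearest preceding I/P frame; B-frames use the nearest I/P frame on
    each side; I-frames depend on nothing.
    """
    deps = {}
    for i, kind in enumerate(frame_types):
        if kind == "I":
            deps[i] = []
            continue
        prev_ref = max(j for j in range(i) if frame_types[j] in "IP")
        if kind == "P":
            deps[i] = [prev_ref]
        else:  # "B"
            next_ref = min(
                j for j in range(i + 1, len(frame_types)) if frame_types[j] in "IP"
            )
            deps[i] = [prev_ref, next_ref]
    return deps


deps = reference_frames("IBBPBBP")
for index, refs in deps.items():
    print(index, "IBBPBBP"[index], refs)
```

In this model only frame 0 can be decoded in isolation; every other frame requires at least one other frame to be decoded first.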
Whilst the use of I-frames, P-frames and B-frames provides valuable file size savings, temporal compression techniques can impair the user's viewing experience. For example, a user may wish to skip to a specific position in the file and begin playback from that position instead of watching the entire video in order.
If an I-frame is located in the video file at the user's selected position, then playback can begin from the selected position. However, if an I-frame is not present at the desired location then, in most cases, the video decoder will seek to the nearest I-frame location. Playback then starts from that I-frame, and the user must wait until the desired segment of the video file is reached.
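The seeking behaviour described above can be sketched as follows, assuming for illustration that the decoder falls back to the nearest I-frame at or before the requested position and that the I-frame positions are available as a sorted list of frame numbers. The function name `seek_start` and the example positions are hypothetical.

```python
from bisect import bisect_right


def seek_start(i_frame_positions, target):
    """Return the I-frame position from which decoding must start.

    i_frame_positions must be sorted in ascending order. The nearest
    I-frame at or before the target is chosen (an assumption made for
    this sketch).
    """
    idx = bisect_right(i_frame_positions, target) - 1
    if idx < 0:
        raise ValueError("no I-frame at or before the target frame")
    return i_frame_positions[idx]


i_frames = [0, 250, 610, 900]  # e.g. I-frames at scene changes only

start = seek_start(i_frames, 500)
print(start)        # 250
print(500 - start)  # 250 frames to get through before the desired position
```

With I-frames present only at these positions, a request for frame 500 starts 250 frames early, which at 25 frames per second would correspond to ten seconds of unwanted playback.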
One known way to address the above problem is to insert more I-frames into the compressed video file. In addition to the I-frames located at scene switching points, I-frames are inserted at regular intervals, for example every second or every 20 frames, so that the video can be accessed at a finer granularity. However, the presence of more I-frames increases the file size of the video.
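The trade-off between seek granularity and file size can be estimated with a short calculation. The sketch below assumes, purely for illustration, that every I-frame occupies 100,000 bytes and every dependent frame 10,000 bytes, and that I-frames occur only at a fixed interval; real frame sizes vary with content and encoder settings. The function name `estimated_size` is hypothetical.

```python
import math


def estimated_size(total_frames, i_interval, i_bytes, dep_bytes):
    """Estimate file size when one frame in every i_interval is an I-frame."""
    n_i = math.ceil(total_frames / i_interval)
    return n_i * i_bytes + (total_frames - n_i) * dep_bytes


# 1500 frames, with hypothetical per-frame sizes.
sparse = estimated_size(1500, i_interval=250, i_bytes=100_000, dep_bytes=10_000)
dense = estimated_size(1500, i_interval=20, i_bytes=100_000, dep_bytes=10_000)

print(sparse)  # 15540000
print(dense)   # 21750000
```

Under these assumed sizes, reducing the I-frame interval from 250 frames to 20 frames shortens the worst-case distance to a preceding I-frame from 249 frames to 19 frames, at the cost of a file roughly 40% larger.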
The present invention addresses the above problems.