Field of the Invention
Embodiments of the present invention relate generally to computer science and, more specifically, to techniques for determining native resolutions of video sequences.
Description of the Related Art
Video sequences may be presented in any number of different resolutions.
Typically, the chosen resolution represents tradeoffs between resources required to generate and operate on the video sequence (e.g., camera resolution, processing time, bandwidth, storage, etc.) and visual quality. For example, if the resolution of a video sequence is 1080p, then each frame includes 2,073,600 pixels arranged into 1080 rows and 1920 columns. By contrast, if the resolution of a video sequence is 2160p, then each frame includes 8,294,400 pixels arranged into 2160 rows and 3840 columns. Since the 2160p video sequence includes four times as much data as the 1080p video sequence, the visual quality of the 2160p video sequence displayed at the full resolution of 2160p is typically higher than the visual quality of the 1080p video sequence. However, as the resolution of a video sequence increases, storing the video sequence requires more memory, and transferring the video sequence requires more bandwidth. Further, generating and displaying the video sequence at a particular resolution requires equipment capable of supporting the particular resolution.
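The per-frame pixel counts given above follow directly from the frame dimensions. As a minimal sketch, assuming the common 16:9 frame sizes of 1920x1080 for 1080p and 3840x2160 for 2160p (the helper name pixels_per_frame is hypothetical and used only for illustration), the arithmetic may be checked as follows:

```python
def pixels_per_frame(rows: int, columns: int) -> int:
    """Return the number of pixels in one frame with the given dimensions."""
    return rows * columns


# Assumed 16:9 frame sizes for the two resolutions discussed in the text.
pixels_1080p = pixels_per_frame(rows=1080, columns=1920)
pixels_2160p = pixels_per_frame(rows=2160, columns=3840)

# Ratio of per-frame data between the two resolutions.
ratio = pixels_2160p / pixels_1080p

print(pixels_1080p)  # 2073600
print(pixels_2160p)  # 8294400
print(ratio)         # 4.0
```

Doubling both the number of rows and the number of columns quadruples the pixel count, which is why storage and bandwidth requirements grow much faster than the row count alone suggests.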
To reduce the resources required to operate on video sequences and/or comply with resolution limitations of equipment or processes, oftentimes a video sequence may undergo one or more down-sampling operations that reduce the amount of data included in the frames within the sequence. Subsequently, up-sampling operations may be applied to the video sequence for, among other things, compatibility with other video sequences and/or playback equipment. For instance, a video sequence may be up-sampled as part of splicing the video sequence with another video sequence that has been stored at a higher resolution to create a movie. Upon playback via an endpoint consumer device (such as a laptop), the movie may be viewed at the final, higher resolution. However, in general, because down-sampling operations eliminate selected information, subsequent up-sampling operations produce only an approximate reconstruction of the original video sequence. Consequently, if down-sampling and subsequent up-sampling operations have been performed on any portion of a video sequence, then the visual quality of the video sequence is compromised.
For example, to reduce the memory required to store a 2160p video sequence “A”, the video sequence “A” could be down-sampled and then stored at a resolution of 1080p. Subsequently, to include the video sequence “A” in a 2160p movie, the video sequence “A” would need to be up-sampled to a resolution of 2160p. However, because the down-sampling operations would have eliminated selected information in the video sequence “A,” the subsequent up-sampling operations would produce only an approximate reconstruction of the original video sequence “A.” Notably, although the video sequence “A” included in the 2160p movie could be labeled as having a resolution of 2160p, the actual visual quality of the video sequence “A” included in the 2160p movie would be commensurate with an “effective resolution” of 1080p. Consequently, if the movie were displayed at 2160p, then the overall visual quality of the movie would be degraded compared to a true 2160p viewing experience.
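The loss of information described above may be illustrated with a small numerical sketch. The example below assumes a 2x down-sampling that averages non-overlapping 2x2 blocks and a 2x up-sampling that repeats each sample (nearest neighbor); these particular filters, and the function names, are assumptions chosen for illustration and are not prescribed by the foregoing description. Each frame is modeled as a list of rows of luminance values with even dimensions.

```python
def downsample_2x(frame):
    """Halve each dimension by averaging non-overlapping 2x2 blocks."""
    return [
        [
            (frame[r][c] + frame[r][c + 1]
             + frame[r + 1][c] + frame[r + 1][c + 1]) / 4
            for c in range(0, len(frame[0]), 2)
        ]
        for r in range(0, len(frame), 2)
    ]


def upsample_2x(frame):
    """Double each dimension by repeating every sample (nearest neighbor)."""
    out = []
    for row in frame:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def mean_abs_error(a, b):
    """Average absolute per-pixel difference between two equal-size frames."""
    total = sum(abs(x - y) for row_a, row_b in zip(a, b)
                for x, y in zip(row_a, row_b))
    return total / (len(a) * len(a[0]))


# A 4x4 checkerboard: the finest detail a frame of this size can carry.
detailed = [[((r + c) % 2) * 255 for c in range(4)] for r in range(4)]
# A flat 4x4 frame: no detail beyond what the lower resolution can hold.
flat = [[10 for _ in range(4)] for _ in range(4)]

detailed_roundtrip = upsample_2x(downsample_2x(detailed))
flat_roundtrip = upsample_2x(downsample_2x(flat))

detailed_error = mean_abs_error(detailed, detailed_roundtrip)
flat_error = mean_abs_error(flat, flat_roundtrip)

print(detailed_error)  # 127.5
print(flat_error)      # 0.0
```

In this sketch the round trip returns a frame with the original dimensions in both cases, yet the checkerboard detail is replaced by a uniform gray, while the flat frame survives unchanged. The restored frame can therefore be labeled with the higher resolution even though it carries only the information that the lower resolution could hold.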
As the above example illustrates, as a general matter, the lowest resolution at which a video sequence has been stored (referred to herein as the “native” resolution) determines the highest effective resolution with which the video sequence may be rendered and displayed. Consequently, this “native” resolution is more indicative of the visual quality of the video sequence than the “display” resolution at which the video sequence is delivered.
Furthermore, various operations performed on a video sequence are optimized based on the resolution of the video sequence. For example, efficiently and accurately encoding source data is essential for real-time delivery of video sequences. In operation, encoders are usually configured to make tradeoffs between resources consumed during the encoding/decoding process and visual quality based on the resolution of the video sequence. If an encoder is configured to optimize tradeoffs for the resolution of a movie, and a video sequence included in the movie has a “native” resolution that is lower than the resolution of the movie, then the tradeoffs that the encoder implements for the higher resolution can dramatically increase resource burdens, such as storage and bandwidth usage, when encoding the video sequence, without noticeably increasing the visual quality of the video sequence.
Oftentimes, native resolutions of video sequences are unknown or difficult to determine. For example, the distributor of a movie may not be privy to any re-sampling operations that have been performed on any of the video sequences included in the movie. Observing the movie frame-by-frame during playback in an attempt to ascertain any degradation in visual quality associated with re-sampling operations would be prohibitively time-consuming. However, unless the native resolution is ascertained, the problems discussed above cannot be readily addressed.
As the foregoing illustrates, what is needed in the art are more effective techniques for determining the native resolutions of video sequences.