1. Field of the Invention
The invention pertains to the field of video transmissions. More particularly, the invention pertains to a system and a method for generating video frames.
2. Description of the Related Art
Virtually all applications of video and visual communication deal with large quantities of video data. To create a video presentation, a rendering computer displays a plurality of digital images (“frames”) in succession, thereby simulating movement.
Currently, certain technical problems exist relating to transmitting and rendering a video presentation across low bandwidth computer networks. FIG. 1 illustrates a conventional streaming video system 100. In the video system 100, a media server 102 is connected via a network 104 to a rendering computer 106. The media server 102 typically includes one or more video presentations 110 for transmission to the rendering computer 106.
One problem that is encountered in current streaming systems is that the transmission bandwidth between the media server 102 and the rendering computer 106 is not sufficient to support a real-time seamless presentation, such as is provided by a standard television set. To overcome this problem and allow the user to receive the presentation in real-time, the video presentation is often spatially and temporally compressed. Further, to reduce the amount of data that is transmitted, the media server 102 skips selected frames of the presentation, or, alternatively, the video presentation can be developed having only a few frames per second. The resulting presentations, however, are jittery and strobe-like and are simply not as smooth as a presentation that has a higher frame rate.
To increase the rate at which the frames are displayed to a user, a frame generator 112 may be used to provide intermediate frames between two selected reference frames of the video presentation 110. Typically, frame generators fall within one of two categories: linear motion interpolation systems and motion compensated frame interpolation systems. Linear motion interpolation systems superimpose two reference frames of the video presentation 110 to create one or more intermediate frames. Motion compensated frame interpolation systems use motion vectors for frame interpolation.
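The superimposing approach of linear motion interpolation can be illustrated with a minimal sketch. It assumes grayscale frames stored as lists of rows of integer intensities and a simple weighted average per pixel; the function name `blend_frames` and the `weight` parameter are hypothetical and are not taken from any particular frame generator.

```python
def blend_frames(frame_a, frame_b, weight=0.5):
    """Superimpose two equally sized grayscale frames.

    Frames are lists of rows of integer intensities (assumed 0-255).
    `weight` is the assumed temporal position of the intermediate frame:
    0.0 reproduces frame_a, 1.0 reproduces frame_b.
    """
    if len(frame_a) != len(frame_b) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(frame_a, frame_b)
    ):
        raise ValueError("frames must have the same dimensions")
    return [
        [int(round((1.0 - weight) * a + weight * b)) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(frame_a, frame_b)
    ]


first = [[0, 100], [200, 50]]
second = [[100, 100], [0, 150]]
middle = blend_frames(first, second)  # [[50, 100], [100, 100]]
```

Because every pixel is averaged in place, a moving object appears as two faded copies rather than at an intermediate position, which is the limitation that motion compensated systems try to avoid.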
FIG. 2 illustrates the data format of a frame 200 according to one motion compensated frame interpolation system. The frame 200 of FIG. 2 is divided into nine horizontal groups of blocks (GOB). Each GOB includes eleven macroblocks. Each macroblock has four luminance blocks of 8 pixels by 8 lines followed by two downsampled chrominance blocks (Cb and Cr).
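The arithmetic implied by this frame format can be worked through in a short sketch. It assumes that each GOB is a single row of macroblocks and that the four 8 by 8 luminance blocks of a macroblock tile a 16 by 16 pixel area; neither assumption is stated for the frame 200, and the names used are illustrative only.

```python
GOBS_PER_FRAME = 9            # from the frame format described above
MACROBLOCKS_PER_GOB = 11      # from the frame format described above
LUMA_BLOCKS_PER_MACROBLOCK = 4
CHROMA_BLOCKS_PER_MACROBLOCK = 2  # one Cb block and one Cr block
BLOCK_SIZE = 8
MACROBLOCK_SIZE = 2 * BLOCK_SIZE  # assumption: luminance blocks tile as 2 x 2


def frame_layout():
    """Derive frame-level counts from the per-GOB and per-macroblock figures."""
    macroblocks = GOBS_PER_FRAME * MACROBLOCKS_PER_GOB
    return {
        "macroblocks": macroblocks,
        # Assumption: each GOB is one horizontal row of macroblocks.
        "luma_width": MACROBLOCKS_PER_GOB * MACROBLOCK_SIZE,
        "luma_height": GOBS_PER_FRAME * MACROBLOCK_SIZE,
        "luma_blocks": macroblocks * LUMA_BLOCKS_PER_MACROBLOCK,
        "chroma_blocks": macroblocks * CHROMA_BLOCKS_PER_MACROBLOCK,
    }


def macroblock_origin(gob_index, macroblock_index):
    """Top-left luminance pixel (x, y) of a macroblock under the same assumptions."""
    return (macroblock_index * MACROBLOCK_SIZE, gob_index * MACROBLOCK_SIZE)


layout = frame_layout()
# 99 macroblocks covering a 176 x 144 luminance area
```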
In motion compensated interpolation systems, selected macroblocks are assigned a motion vector based upon a reference frame. FIG. 3 illustrates an exemplary reference frame 300. Usually, the reference frame is the last frame that was transmitted to the rendering computer 106. Each motion vector points to an equivalently sized region in the reference frame that is a good match for the macroblock that is to be transmitted. If a good representation cannot be found, the block is independently coded.
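One way an encoder might choose such a motion vector is an exhaustive block search, sketched below. The sum of absolute differences (SAD) cost, the search range, and the `max_sad` threshold for deciding that no good match exists are all assumptions made for illustration; the text above does not specify how a "good match" is measured.

```python
def sad(frame, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block and a shifted reference region."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(frame[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def find_motion_vector(frame, ref, bx, by, size=4, search=2, max_sad=None):
    """Return (dx, dy) of the best-matching region in `ref`, or None.

    None stands for "no good representation found", in which case the
    block would be independently coded.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidate regions that fall outside the reference frame.
            if by + dy < 0 or bx + dx < 0:
                continue
            if by + dy + size > height or bx + dx + size > width:
                continue
            cost = sad(frame, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    if best is None or (max_sad is not None and best[0] > max_sad):
        return None
    return best[1], best[2]


def make_frame(width, height, square_x, square_y):
    """Black frame with a bright 2 x 2 square, used only as test data."""
    frame = [[0] * width for _ in range(height)]
    for y in range(square_y, square_y + 2):
        for x in range(square_x, square_x + 2):
            frame[y][x] = 255
    return frame


reference = make_frame(8, 8, 1, 1)
current = make_frame(8, 8, 3, 2)
vector = find_motion_vector(current, reference, bx=2, by=2)  # (-2, -1)
```

Note that the search minimizes a matching cost, not true motion. A vector chosen this way can point to a similar-looking region even when nothing actually moved.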
By sending motion vectors that point to regions in the reference frame already transmitted to the rendering computer 106, the media server 102 can transmit a representation of a frame using less data than if the pixel information for each pixel in each block is transmitted.
Although current frame generators increase the frame rate, they are simplistic in design. These systems do not account for certain idiosyncrasies within selected streaming presentations. For example, current frame generators that use motion compensated frame interpolation do not account for video presentations that have textual characters. Often a video image is overlaid with video text to convey additional information to the viewer. If motion compensated frame interpolation generates an intermediate frame having textual characters, the generated frame may inappropriately move the text to a new position, thereby creating some floating text that was not intended by the creator of the video presentation.
Another problem associated with existing frame generators is that they perform frame generation indiscriminately, regardless of whether such interpolation results in a better quality video presentation. Although frame interpolation does increase the number of frames presented to the viewer, such frame generation can produce strange results under certain circumstances. Some encoders, for example, choose a motion vector for a selected block based only upon the fact that the motion vector references a block that is a good match for the selected block, even though there is no actual motion from one corresponding frame to the other. Because not all of the vectors represent motion, frame generation should not always be employed in these instances.
Additionally, current frame generators do not perform any type of post-filtering on the generated frames. As can be readily appreciated, since motion compensated interpolation systems build an intermediate frame using blocks of pixels, i.e., macroblocks, the pixels at the border of each block may not be a close match to the pixels in the neighboring block. Accordingly, the borders of each of the blocks may be readily visible to a viewer of the media presentation.
There is a need for a frame generator that handles the frame generation process intelligently. If frame generation would produce anomalous results, frame generation should not be performed. A frame generator should also determine whether the reference frames include textual characters and account for them in the frame generation process. A frame generator should also filter interpolation artifacts from the intermediate frame.
The frame generator of the present invention has several features, no single one of which is solely responsible for its desirable attributes. Without limiting the scope of this invention as expressed by the claims which follow, its more prominent features will now be discussed briefly. After considering this discussion, and particularly after reading the section entitled “Detailed Description of the Invention,” one will understand how the features of this invention provide advantages over other frame generators.
One embodiment of the invention includes a method of generating video frames, the method comprising the acts of receiving first data representing a first video frame, the first data comprising a plurality of elements in a memory in the computer system, each element relating to a group of pixels, receiving second data representing a second video frame, the second data comprising a plurality of elements in the memory in the computer system, each element relating to a group of pixels, generating third data representing at least one video frame based upon information from the first and/or second data, and filtering at least a portion of the generated third data by reducing visible discontinuity between adjacent elements in the at least one generated third data.
Another embodiment of the invention includes a system for generating video frames, the system comprising: means for receiving first video frame data in a memory in the computer system, the first video frame data comprising a plurality of elements, each element corresponding to a group of pixels, the first video frame data representing a first video frame, means for receiving second video frame data in the memory in the computer system, the second video frame data comprising a plurality of elements, each element corresponding to a group of pixels, the second video frame data representing a second video frame, means for generating at least one intermediate video frame based upon information from the first video frame data and/or the second video frame data, the at least one intermediate video frame representing at least one selected element at a position intermediate to respective positions whereat the element is represented by the first video frame and the second video frame, and filter means for reducing visible discontinuity between at least two adjacent elements in the at least one generated intermediate video frame.
Another embodiment of the invention includes a video presentation, comprising first frame data in a memory in the computer system, the first frame data representing a first video frame, the first frame data comprising a plurality of elements, each element corresponding to a group of pixels, second frame data in the memory in the computer system, the second frame data representing a second video frame, the second frame data comprising a plurality of elements, each element corresponding to a group of pixels, and intermediate frame data representing an intermediate video frame between the first and second video frames, the intermediate frame data based upon information from the first and second frame data, the intermediate video frame representing at least one selected element at a position intermediate to respective positions whereat the selected element is represented by the first video frame and the second video frame, and wherein at least a portion of the intermediate video frame has been filtered to reduce visible discontinuities between elements.
Yet another embodiment of the invention includes a system for generating video frames, the system comprising a processor, a memory, a decoder running on said processor, said decoder outputting to said memory first digital data representing a first film frame, said decoder outputting to said memory second digital data representing a second film frame, and a frame generator running on said processor, the frame generator inputting said first digital data and said second digital data, the frame generator outputting to said memory intermediate digital data representing an intermediate film frame based upon information within said first and second digital data, said intermediate digital data including identified groups of pixels, said frame generator reducing visible discontinuities near the perimeters of at least one of the groups of pixels included in said intermediate digital data.
Yet another embodiment of the invention includes a program storage device, storing instructions which, when executed, perform the steps comprising: receiving first data representing a first video frame, the first data comprising a plurality of elements in a memory in the computer system, each element relating to a group of pixels, the first data representing a first element at a first position in the first video frame, receiving second data representing a second video frame, the second data comprising a plurality of elements in the memory in the computer system, each element relating to a group of pixels, the second data representing the first element at a second position in the second video frame, generating third data representing an intermediate video frame based upon information from the first and/or second data, the third data representing the first element at a position intermediate to the first and second positions, and filtering at least a portion of the intermediate video frame by reducing visible discontinuity between the first element and an adjoining element.
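The notion of representing an element at a position intermediate to its first and second positions can be sketched as follows. This is an illustrative assumption only: it interpolates the element's top-left corner linearly and pastes the element's pixels onto an otherwise empty frame. The helper names `intermediate_position` and `paste_element` are hypothetical.

```python
def intermediate_position(first_pos, second_pos, fraction=0.5):
    """Linearly interpolate an element's (x, y) position between two frames.

    `fraction` is the assumed temporal location of the intermediate frame,
    where 0.0 is the first frame and 1.0 is the second frame.
    """
    (x0, y0), (x1, y1) = first_pos, second_pos
    return (
        round(x0 + (x1 - x0) * fraction),
        round(y0 + (y1 - y0) * fraction),
    )


def paste_element(width, height, element, position):
    """Draw an element (rows of pixel values) onto a blank frame at `position`."""
    canvas = [[0] * width for _ in range(height)]
    left, top = position
    for dy, row in enumerate(element):
        for dx, value in enumerate(row):
            canvas[top + dy][left + dx] = value
    return canvas


element = [[9, 9], [9, 9]]
position = intermediate_position((0, 0), (4, 2))  # (2, 1)
frame = paste_element(8, 4, element, position)
```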
Yet another embodiment of the invention includes a method of generating frames, the method comprising the acts of: receiving a first frame in a memory in the computer system, the first frame representative of a digital image at a first time, the first frame including a plurality of macroblocks, each of the macroblocks having four quadrants with a plurality of rows and columns of pixels, and each of the pixels having an associated intensity value, receiving a second frame in the memory in the computer system, the second frame representative of the digital image at a second time, the second frame including a plurality of macroblocks, each of the macroblocks having four quadrants with a plurality of rows and columns of pixels, and each of the pixels having an associated intensity value, generating at least one intermediate frame based upon the macroblock quadrants in the first and/or second frames, the at least one intermediate frame representative of an intermediate position of one or more selected macroblock quadrants in the first frame and/or the second frame, determining a filter strength, selectively filtering pixels in the macroblock quadrants based upon the filter strength, determining the average of the pixel intensity of one or more proximately positioned pixels with respect to each of the selected pixels, and associating with each selected pixel the respective determined average pixel intensity.
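The averaging step of this embodiment can be illustrated with a minimal sketch. It assumes that the filter strength is an integer giving how many pixel columns on each side of a block border are filtered, and that each selected pixel is replaced by the average of itself and its left and right neighbors. How the strength is actually determined is not covered here, so `strength` is simply a hypothetical caller-supplied parameter, and only vertical borders are handled.

```python
def smooth_vertical_borders(frame, block_size=8, strength=1):
    """Reduce visible discontinuities at vertical block borders.

    `frame` is a list of rows of integer intensities. For every border
    between horizontally adjacent blocks, the `strength` columns on each
    side are replaced by a three-pixel horizontal average (an assumed
    choice of "proximately positioned pixels").
    """
    height, width = len(frame), len(frame[0])
    out = [row[:] for row in frame]  # read from `frame`, write to `out`
    if strength <= 0:
        return out
    for border in range(block_size, width, block_size):
        first = max(1, border - strength)
        last = min(width - 1, border + strength)
        for x in range(first, last):
            for y in range(height):
                out[y][x] = (frame[y][x - 1] + frame[y][x] + frame[y][x + 1]) // 3
    return out


blocky = [[10, 10, 10, 10, 40, 40, 40, 40]]
smoothed = smooth_vertical_borders(blocky, block_size=4, strength=1)
# [[10, 10, 10, 20, 30, 40, 40, 40]]
```

A second pass over horizontal borders would follow the same pattern with rows and columns exchanged.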
Yet another embodiment of the invention includes a method of generating frames, the method comprising receiving a first frame having a set of elements, the elements collectively defining a digital image, generating a second frame using the set of elements from the first frame, the second frame representative of the first frame at a point in time either before or after the first frame, the second frame representing at least one of the elements at a position different than the position at which it was represented by the first frame, and filtering the second frame to reduce visible discontinuities in at least one region including adjoining elements.
Yet another embodiment of the invention includes a system for generating video frames, the system comprising a frame analysis module for receiving frames, and a frame synthesis module for generating at least one frame between two received frames, the frame synthesis module filtering the generated frames.