1. Field of the Invention
The invention pertains to the field of video transmissions. More particularly, the invention pertains to a system and a method for generating video frames.
2. Description of the Related Art
Virtually all applications of video and visual communication deal with large quantities of video data. To create a video presentation, a rendering computer displays a plurality of digital images ("frames") in succession, thereby simulating movement.
Currently, certain technical problems exist relating to transmitting and rendering a video presentation across low bandwidth computer networks. FIG. 1 illustrates a conventional streaming video system 100. In the video system 100, a media server 102 is connected via a network 104 to a rendering computer 106. The media server 102 typically includes one or more video presentations 110 for transmission to the rendering computer 106.
One problem that is encountered in current streaming systems is that the transmission bandwidth between the media server 102 and the rendering computer 106 is not sufficient to support a real-time seamless presentation, such as is provided by a standard television set. To overcome this problem and allow the user to receive the presentation in real-time, the video presentation is often spatially and temporally compressed. Further, to reduce the amount of data that is transmitted, the media server 102 skips selected frames of the presentation, or, alternatively, the video presentation can be developed having only a few frames per second. The resulting presentations, however, are jittery and strobe-like and are simply not as smooth as a presentation that has a higher frame rate.
To increase the rate at which the frames are displayed to a user, a frame generator 112 may be used to provide intermediate frames between two selected reference frames of the video presentation 110. Typically, frame generators fall within one of two categories: linear motion interpolation systems and motion compensated frame interpolation systems. Linear motion interpolation systems superimpose two reference frames of the video presentation 110 to create one or more intermediate frames. Motion compensated frame interpolation systems use motion vectors for frame interpolation.
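The linear approach described above can be illustrated with a minimal sketch. It assumes grayscale frames stored as lists of rows of 0-255 intensities; the function name `blend_frames` and the weighting parameter `t` are hypothetical and are not taken from any particular frame generator.

```python
def blend_frames(frame_a, frame_b, t=0.5):
    """Superimpose two equally sized grayscale frames.

    t is an assumed temporal position for the new frame:
    0.0 reproduces frame_a, 1.0 reproduces frame_b.
    """
    if len(frame_a) != len(frame_b):
        raise ValueError("frames must have the same height")
    blended = []
    for row_a, row_b in zip(frame_a, frame_b):
        if len(row_a) != len(row_b):
            raise ValueError("frames must have the same width")
        # Weighted average of co-located pixels; nothing is moved.
        blended.append([round((1 - t) * a + t * b)
                        for a, b in zip(row_a, row_b)])
    return blended
```

Because co-located pixels are only averaged, an object that moves between the two reference frames appears as two faint copies rather than at an in-between position, which is the limitation that motion compensated interpolation is meant to address.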
FIG. 2 illustrates the data format of a frame 200 according to one motion compensated frame interpolation system. The frame 200 of FIG. 2 is divided into nine horizontal groups of blocks (GOB). Each GOB includes eleven macroblocks. Each macroblock has four luminance blocks of 8 pixels by 8 lines followed by two downsampled chrominance blocks (Cb and Cr).
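As a rough illustration of the layout of frame 200, the sketch below derives the luminance dimensions implied by the counts given above. It assumes that each GOB is a single row of macroblocks and that the four 8 by 8 luminance blocks are arranged two by two, so that a macroblock covers 16 by 16 luminance pixels. The constant and function names are hypothetical.

```python
GOBS_PER_FRAME = 9          # from the description of frame 200
MACROBLOCKS_PER_GOB = 11    # from the description of frame 200
BLOCK_SIZE = 8              # luminance block: 8 pixels by 8 lines
MACROBLOCK_SIZE = 2 * BLOCK_SIZE  # assumption: 2 x 2 luminance blocks


def frame_dimensions():
    """Luminance width and height, assuming one macroblock row per GOB."""
    width = MACROBLOCKS_PER_GOB * MACROBLOCK_SIZE
    height = GOBS_PER_FRAME * MACROBLOCK_SIZE
    return width, height


def macroblock_origin(gob_index, mb_index):
    """Top-left luminance pixel (x, y) of a macroblock within the frame."""
    if not 0 <= gob_index < GOBS_PER_FRAME:
        raise IndexError("gob_index out of range")
    if not 0 <= mb_index < MACROBLOCKS_PER_GOB:
        raise IndexError("mb_index out of range")
    return mb_index * MACROBLOCK_SIZE, gob_index * MACROBLOCK_SIZE
```

Under these assumptions the frame holds 99 macroblocks and measures 176 by 144 luminance pixels.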
In motion compensated interpolation systems, selected macroblocks are assigned a motion vector based upon a reference frame. FIG. 3 illustrates an exemplary reference frame 300. Usually, the reference frame is the last frame that was transmitted to the rendering computer 106. Each motion vector points to an equivalently sized region in the reference frame that is a good match for the macroblock that is to be transmitted. If a good representation cannot be found, the block is independently coded.
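The text does not specify how an encoder decides that a region is a "good match." One common choice, assumed here purely for illustration, is an exhaustive search that minimizes the sum of absolute differences (SAD) between the block and candidate regions of the reference frame. The names `find_motion_vector`, `search`, and `max_sad` are hypothetical.

```python
def sad(current, reference, cx, cy, rx, ry, size):
    """Sum of absolute differences between two size x size regions."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(current[cy + dy][cx + dx]
                         - reference[ry + dy][rx + dx])
    return total


def find_motion_vector(current, reference, cx, cy,
                       size=16, search=4, max_sad=None):
    """Return (mx, my) pointing at the best match in the reference frame.

    Returns None when no candidate fits inside the reference frame or
    when the best SAD exceeds max_sad, in which case the block would be
    independently coded.
    """
    height, width = len(reference), len(reference[0])
    best = None
    for my in range(-search, search + 1):
        for mx in range(-search, search + 1):
            rx, ry = cx + mx, cy + my
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue  # candidate region falls outside the frame
            cost = sad(current, reference, cx, cy, rx, ry, size)
            if best is None or cost < best[0]:
                best = (cost, (mx, my))
    if best is None or (max_sad is not None and best[0] > max_sad):
        return None
    return best[1]
```

Note that the search selects whichever region has the lowest pixel difference, so the returned vector is a good match but is not guaranteed to describe actual motion in the scene.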
By sending motion vectors that point to regions in the reference frame already transmitted to the rendering computer 106, the media server 102 can transmit a representation of a frame using less data than if the pixel information for each pixel in each block is transmitted.
Although current frame generators increase the frame rate, they are simplistic in design. These systems do not account for certain idiosyncrasies within selected streaming presentations. For example, current frame generators that use motion compensated frame interpolation do not account for video presentations that have textual characters. Often a video image is overlaid with video text to convey additional information to the viewer. If motion compensated frame interpolation generates an intermediate frame having textual characters, the generated frame may inappropriately move the text to a new position, thereby creating some floating text that was not intended by the creator of the video presentation.
Another problem associated with existing frame generators is that they unintelligently perform frame generation regardless of whether such interpolation results in a better quality video presentation. Although frame interpolation does increase the number of frames presented to the viewer, such frame generation can produce strange results under certain circumstances. Some encoders, for example, choose a motion vector for a selected block based only upon the fact that the motion vector references a block that is a good match for the selected block, even though there is no actual motion from one corresponding frame to the other. Thus, since not all of the vectors represent motion, frame generation should not always be employed in these instances.
Additionally, current frame generators do not perform any type of post filtering on the generated frames. As can be readily appreciated, since motion compensated interpolation systems build an intermediate frame using blocks of pixels, i.e., macroblocks, the pixels at the border of each block may not be a close match to the pixels in the neighboring block. Accordingly, the borders of each of the blocks may be readily visible to a viewer of the media presentation.
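To make the border problem concrete, the following sketch softens the two pixel columns on either side of each vertical block border. It is an illustrative filter only, with assumed 3:1 weights and a hypothetical function name; it is not the filtering performed by any particular system, and horizontal borders would be handled analogously.

```python
def smooth_vertical_block_edges(frame, block_size=16):
    """Return a copy of frame with vertical block borders softened.

    frame is a list of rows of integer intensities. Only the pixel
    immediately left and right of each border is modified.
    """
    out = [row[:] for row in frame]
    if not frame:
        return out
    width = len(frame[0])
    for row_in, row_out in zip(frame, out):
        for x in range(block_size, width, block_size):
            left, right = row_in[x - 1], row_in[x]
            # Pull each border pixel a quarter of the way toward its
            # neighbor across the border (rounded integer arithmetic).
            row_out[x - 1] = (3 * left + right + 2) // 4
            row_out[x] = (left + 3 * right + 2) // 4
    return out
```

For a row `[0, 0, 100, 100]` with a border between the second and third pixels, the step from 0 to 100 becomes the gentler ramp 0, 25, 75, 100.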
There is a need for a frame generator that behaves intelligently about the frame generation process. If frame generation would produce anomalous results, frame generation should not be performed. A frame generator should also determine whether the reference frames include textual characters and account for them in the frame generation process. A frame generator should also filter interpolation artifacts from the intermediate frame.
The frame generator of the present invention has several features, no single one of which is solely responsible for its desirable attributes. Without limiting the scope of this invention as expressed by the claims which follow, its more prominent features will now be discussed briefly. After considering this discussion, and particularly after reading the section entitled "Detailed Description of the Invention," one will understand how the features of this invention provide advantages over other frame generators.
One embodiment of the invention includes a method of generating video frames, the method comprising receiving first frame data in a memory in the computer system, the first frame data representing a first video frame, the first frame data comprising a plurality of elements, each element corresponding to a group of pixels, receiving second frame data in the memory in the computer system, the second frame data representing a second video frame, the second frame data comprising a plurality of elements, each element corresponding to a group of pixels, determining which elements of the first and the second frames define, at least in part, video text, and generating at least one intermediate frame using the elements of the first frame and/or the second frame.
Another embodiment of the invention includes a system for generating video frames, the system comprising means for receiving a first frame in a memory in the computer system, the first frame representative of a digital image at a first time, the first frame comprising elements corresponding to pixel groups, means for receiving a second frame in the memory in the computer system, the second frame representative of the digital image at a second time, the second frame comprising elements corresponding to pixel groups, means for determining which elements of the first and the second frames define, at least in part, video text, means for generating at least one intermediate frame using the elements of the first frame and/or the second frame, wherein the intermediate frame represents at least one element at a position intermediate to respective positions whereat the at least one element is represented by the first frame and the second frame, and maintaining in the intermediate frame the spatial positioning of the determined textual elements of the first and/or second frames.
Yet another embodiment of the invention includes a method of generating frames having video text, the method comprising receiving a first frame having textual characters, determining whether the first frame includes textual characters, and generating a second frame, the second frame maintaining the positioning of the textual characters in the second frame with respect to the first frame.
Yet another embodiment of the invention includes a program storage device storing instructions which, when executed, perform the steps comprising receiving a first frame in a memory in the computer system, the first frame representative of a digital image at a first time, the first frame comprising elements corresponding to pixel groups, receiving a second frame in the memory in the computer system, the second frame representative of the digital image at a second time, the second frame comprising elements corresponding to pixel groups, determining which elements of the first and the second frame define, at least in part, video text, generating at least one intermediate frame using the elements of the first frame and/or the second frame, the intermediate frame representing at least one element at a position intermediate to respective positions whereat the at least one element is represented by the first frame and the second frame, and maintaining in the intermediate frame the spatial positioning of the elements determined to define video text elements.
Yet another embodiment of the invention includes a method of generating intermediate digital frames, the method comprising the steps of receiving a first frame having a plurality of macroblocks in a memory in the computer system, the first frame representative of an image at a first instance in time, the plurality of macroblocks each having four quadrants, each of the four quadrants having a plurality of rows and columns of pixels, each of the pixels having an associated intensity value, receiving a second frame having a plurality of macroblocks in the memory in the computer system, the second frame representative of the image at a second instance in time, the plurality of macroblocks each having four quadrants, each of the four quadrants having a plurality of rows and columns of pixels, each of the pixels having an associated intensity value, and determining which macroblock quadrants of the first and the second frame define, at least in part, video text, the determining step comprising identifying a number of edges that are within each of the macroblock quadrants, and determining whether the number of edges exceeds a threshold, generating at least one intermediate frame using the macroblock quadrants of the first frame and/or the second video frame, wherein the intermediate frame is representative of an intermediate position of one or more selected macroblock quadrants in the first frame and/or the second frame, and maintaining in the at least one intermediate frame the spatial positioning of the macroblock quadrants determined to define video text.
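The edge-counting determination recited above can be sketched as follows. The summary does not define what constitutes an edge, so this example assumes an edge is any pair of horizontally or vertically adjacent pixels whose intensities differ by at least `min_step`. The function names and both default thresholds are hypothetical values chosen for illustration.

```python
def count_edges(quadrant, min_step=64):
    """Count adjacent pixel pairs whose intensity difference is large.

    quadrant is a list of rows of intensity values, for example the
    8 x 8 pixels of one macroblock quadrant.
    """
    edges = 0
    rows = len(quadrant)
    for y in range(rows):
        cols = len(quadrant[y])
        for x in range(cols):
            # Compare with the right-hand neighbor.
            if x + 1 < cols and abs(quadrant[y][x] - quadrant[y][x + 1]) >= min_step:
                edges += 1
            # Compare with the neighbor below.
            if y + 1 < rows and abs(quadrant[y][x] - quadrant[y + 1][x]) >= min_step:
                edges += 1
    return edges


def looks_like_text(quadrant, edge_threshold=20, min_step=64):
    """Flag a quadrant as video text when its edge count exceeds a threshold."""
    return count_edges(quadrant, min_step) > edge_threshold
```

A flat quadrant yields no edges and is not flagged, while a quadrant of alternating dark and bright columns, which resembles the sharp strokes of overlaid characters, yields many edges and is flagged.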
Yet another embodiment of the invention includes a system for generating video frames, the system comprising: a frame analysis module for receiving frames, the frame analysis module detecting the presence of video text within the frames, and a frame synthesis module for generating frames between two received frames, the generated frames including the video text in the same spatial position as in either or both of the received frames.
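One way to picture the synthesis step that keeps video text in the same spatial position is to overwrite the text-flagged blocks of a generated frame with the co-located pixels of a received frame. The sketch below assumes the analysis step has already produced a list of flagged block coordinates; the function name `keep_text_in_place` and its parameters are hypothetical.

```python
def keep_text_in_place(intermediate, reference, text_blocks, block_size=8):
    """Copy text-flagged blocks from reference into a copy of intermediate.

    text_blocks is an iterable of (block_x, block_y) indices, assumed to
    come from a separate text detection step. Both frames are lists of
    rows of intensities and must have the same dimensions.
    """
    out = [row[:] for row in intermediate]
    for bx, by in text_blocks:
        x0, y0 = bx * block_size, by * block_size
        for y in range(y0, min(y0 + block_size, len(out))):
            x1 = min(x0 + block_size, len(out[y]))
            # Same position in both frames, so the text does not float.
            out[y][x0:x1] = reference[y][x0:x1]
    return out
```

Blocks that are not flagged keep whatever the interpolation produced, so only the text regions are pinned to their original position.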