Video creators are increasingly using graphics processing units (“GPUs”) and their graphics memory (e.g., frame buffers) to facilitate streaming of video from an external video source to the graphics memory. GPUs are high-performance three-dimensional (“3D”) processors that include 3D graphics pipelines to perform graphics operations, such as transformations, lighting, setup, rendering and the like. One type of external video source generates video compatible with the Standard Definition Serial Digital Interface (“SD-SDI”) and/or High Definition Serial Digital Interface (“HD-SDI”) standards, as maintained by the Society of Motion Picture and Television Engineers (“SMPTE”). Professional video creators and television broadcasters use these standards, such as SMPTE 259M, to create high-quality video images. The video is loaded into graphics memory so that it can be either scanned out to a display or captured (i.e., video captured) into a storage medium, such as a disk. Traditionally, video input cards convert the input video stream into a format usable by a central processing unit (“CPU”). While functional, conventional video input cards have several drawbacks in using graphics memory to display and capture the video.
FIG. 1 is a block diagram representative of traditional techniques for both storing video into graphics memory and performing graphics-related operations on the video data to modify the video image. As shown, computing device 100 includes graphics memory 102, a graphics processing unit (“GPU”) 104, a central processing unit (“CPU”) 110, and system memory 120. In operation, video input card 106 receives digitized video (“video in”) 107, such as in SDI format. Converter logic 108 of video input card 106 then converts the SDI format into a format that is suitable for use by CPU 110 and system memory 120. For instance, converter logic 108 converts digitized video 107 from one color space (e.g., YCrCb) to another (e.g., RGB) and converts interlaced video into non-interlaced video.
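By way of illustration only, the two kinds of conversion attributed to converter logic 108 can be sketched in software as follows. The sketch is not taken from any particular video input card. It assumes 8-bit studio-range samples and the commonly published ITU-R BT.601 coefficients (YCrCb is more commonly written YCbCr), although SDI samples are typically wider than 8 bits, and it assumes a simple "weave" as the de-interlacing method. The function names ycbcr_to_rgb and weave_fields are hypothetical.

```python
def ycbcr_to_rgb(y, cb, cr):
    """Convert one 8-bit studio-range YCbCr sample to full-range 8-bit RGB.

    Assumption: ITU-R BT.601 coefficients, with luma in 16..235 and
    chroma centered on 128.
    """
    c, d, e = y - 16, cb - 128, cr - 128
    r = 1.164 * c + 1.596 * e
    g = 1.164 * c - 0.392 * d - 0.813 * e
    b = 1.164 * c + 2.017 * d
    # Round and clamp each channel to the 0..255 range.
    return tuple(max(0, min(255, round(v))) for v in (r, g, b))


def weave_fields(top_field, bottom_field):
    """Build a non-interlaced frame by interleaving the lines of two fields.

    Assumption: both fields hold the same number of scan lines, and the
    top field supplies line 0 of the resulting frame.
    """
    if len(top_field) != len(bottom_field):
        raise ValueError("fields must contain the same number of lines")
    frame = []
    for top_line, bottom_line in zip(top_field, bottom_field):
        frame.append(top_line)
        frame.append(bottom_line)
    return frame
```

In this sketch every sample is touched by general-purpose code, which is representative of the per-pixel work that the traditional path performs before the video ever reaches graphics memory 102.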
The traditional technique of storing video in graphics memory includes sending digitized video 107 over a path that includes numerous devices and/or processing steps, each of which adds delay to the transfer of video. Sending digitized video 107 over a path having numerous delays requires complex coordination of the video transfer, especially for real-time video. As shown, computing device 100 requires digitized video 107 to traverse path 144 to system memory 120, which includes a number of layers 150. CPU 110 executes instructions from an application program in application layer 130 to initiate the video transfer to the lower layers. Next, an applications program interface (“API”) layer 132 translates the instructions for transferring digitized video 107 down through the operating system (“O/S”) 134 to a graphics driver 136. To do so, APIs in the API layer 132 communicate with a library 160, which contains precompiled routines for translating commands from the application program into GPU-specific instructions. Note that accessing library 160 adds delay. Graphics driver 136 then provides abstract commands for one or more push buffers 138, each of which provides an interface between software and hardware. As such, graphics driver 136 typically inserts GPU commands and data into push buffer 138 and then initiates transportation of the GPU commands and data via path 142 to graphics memory 102.
There are several drawbacks to transferring data to graphics memory 102 over paths 142 and 144. First, graphics driver 136 inserts digitized video 107 into data frames in a format that is generally not compatible with the native data format associated with the architecture of GPU 104. Incompatibilities with the native data format generally result in inefficiencies, since suboptimal amounts of digitized video 107 are usually transferred with the format set by push buffer 138. This decreases throughput and exacerbates delays. Second, computational resources of computing device 100, such as CPU 110 and system memory 120, are integral in facilitating the data transfer. As such, CPU 110 and system memory 120 must allocate their resources to performing the data transfer via layers 150 rather than to other tasks. This hinders performance of computing device 100 when performing those other tasks. Third, the translation of video data from application layer 130 to push buffer 138 injects spurious delays that require precise synchronization of the data transfer, especially when digitized video 107 is real-time high-definition video. Moreover, the translation also depends on CPU 110 having CPU cycles to devote to the video transfer.
Responsive to execution of an applications program, CPU 110 interacts via path 140 with GPU 104 to access graphics memory 102 when performing a graphics-related operation on the video data. Examples of such graphics operations include color corrections, color conversions (e.g., expanding or reducing the color depth), color space conversions, bit reordering (e.g., reordering RGB into BGRA, where “A” indicates “alpha”), alpha filtering, and any other graphics-related operation. One drawback to performing graphics-related operations on digitized video 107 is that computational resources of computing device 100, including CPU 110, are again tasked, in whole or in part, with modifying video images. Thus, if CPU 110 is tasked with a higher priority task, the graphics-related operation may be delayed. Another drawback is that the performance capabilities of CPU 110 (e.g., operational speed) and/or system memory 120 (e.g., access times) govern the rate at which graphics-related operations occur as well as the rate at which digitized video 107 is transferred. Consequently, other higher priority tasks and the suboptimal capabilities of the hardware can detrimentally influence video being loaded into graphics memory 102.
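As one hedged illustration of the bit reordering mentioned above, the following sketch reorders tightly packed 8-bit RGB pixels into BGRA when the work is left to general-purpose software. The function name rgb_to_bgra, the packed byte layout, and the default fully opaque alpha value of 255 are assumptions made for the example and do not describe any particular driver or application.

```python
def rgb_to_bgra(pixels, alpha=255):
    """Reorder packed 8-bit RGB pixels into BGRA, appending an alpha byte.

    Assumptions: `pixels` is a bytes-like object holding R, G, B triples
    with no padding, and every output pixel receives the same alpha value.
    """
    if len(pixels) % 3 != 0:
        raise ValueError("input length must be a multiple of 3 bytes")
    out = bytearray()
    for i in range(0, len(pixels), 3):
        r, g, b = pixels[i], pixels[i + 1], pixels[i + 2]
        # Swap the red and blue positions and append the alpha channel.
        out += bytes((b, g, r, alpha))
    return bytes(out)
```

Because the loop visits every pixel of every frame, the rate at which such an operation completes is bounded by the speed of the processor executing it and by the access times of the memory holding the pixels.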
In view of the foregoing, it would be desirable to provide a GPU video data preprocessor, a computer device, an apparatus and a method that minimize the above-mentioned drawbacks, thereby facilitating expeditious video transfer to graphics memory for enhancing display and video capture applications, among others.